BERND ENGELMANN 
ROBERT RAUHMEIER 
Editors 


The Basel II
Risk Parameters 


Estimation, Validation, Stress Testing - 
with Applications to Loan Risk Management 


Springer




Second edition 




Editors 


Dr. Bernd Engelmann Dr. Robert Rauhmeier 
bernd.engelmann@quantsolutions.de robert.rauhmeier@arcor.de 
ISBN 978-3-642-16113-1 e-ISBN 978-3-642-16114-8 


DOI 10.1007/978-3-642-16114-8 
Springer Heidelberg Dordrecht London New York 


Library of Congress Control Number: 2011924881 


© Springer-Verlag Berlin Heidelberg 2006, 2011 

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, 
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 
1965, in its current version, and permission for use must always be obtained from Springer. Violations 
are liable to prosecution under the German Copyright Law. 

The use of general descriptive names, registered names, trademarks, etc. in this publication does 
not imply, even in the absence of a specific statement, that such names are exempt from the relevant 
protective laws and regulations and therefore free for general use. 


Cover design: WMXDesign GmbH, Heidelberg, Germany 
Printed on acid-free paper 


Springer is part of Springer Science+Business Media (www.springer.com) 


Preface to the Second Edition 


The years after the first edition of this book appeared have been very turbulent. We
have seen one of the largest financial crises in the history of the global financial
system. Banks that had existed for more than a century have disappeared or had
to be rescued by the state. Although Basel II has been implemented by many banks
so far, and a lot of effort is still being spent on improving credit risk management by
building up rating systems and procedures for estimating the loan loss parameters
PD, LGD, and EAD, there is still a feeling that this is insufficient to protect the
financial system from further crises.

There are ongoing discussions on how the financial system can be stabilized by either
improving the regulatory framework or the internal risk management of banks.
While we were working on this second edition, the regulatory framework
Basel III was under discussion. The basic idea behind Basel III is extending the capital
basis of banks. It is not the aim of Basel III to improve the methods and processes of 
banks’ internal credit risk management but simply to improve system stability by 
increasing capital buffers. Since we did not view this book as a book on regulation 
(although it was motivated by a regulatory framework) but as a book on risk 
management, we do not discuss the current regulatory ideas in this edition. 

Instead, we focus on one of the causes for the financial crisis, the lending 
behaviour of banks in the retail sector. By retail, we mean lending to debtors 
where no market information on their credit quality, like asset swap or credit 
default swap spreads, is available. This is the case for almost all loans except 
for loans to large corporations, states or banks. One of the origins of the recent
financial crisis was that American banks granted mortgage loans to too many
debtors with low income. On the assumption that house prices could not fall sharply, it
was thought that the value of the loan’s collateral would be sufficient in the case of
a default to ensure that no loss occurs. A large number of bankruptcies among
the banks which had invested in the American housing sector and expensive
rescue programs for banks that were considered too important to fail are the
result of this wrong assumption.

The consequences of the financial crisis are not yet clear. The question of what an
optimal financial system should look like is difficult to answer. On the one hand the
lending behaviour of banks should not be too restrictive because this will obstruct
the real economy. On the other hand it must be restrictive enough to prevent the
creation of bubbles. The same considerations are true for the spectrum of financial
products. There should be enough vehicles for banks and corporations to manage 
their risks but the complexity and the volume of derivative instruments should not 
lead to a less stable financial system. 

We do not attempt to give an answer to this complex question. Contrary to some 
opinions in the aftermath of the crisis that blamed mathematical models as its main 
driver, we still believe that mathematics and statistics are valuable tools to quantify 
risks. However, one has to be aware that this cannot be done with arbitrary 
precision. The role of a model in our view is more to increase the transparency of 
a bank’s business and to identify key risks. We want to illustrate this view by 
presenting a pricing framework for retail loans that shows how the Basel II risk 
parameters can be used in building a simple and transparent framework for the 
pricing and the risk management of loan portfolios. In our view an increase in 
transparency in the loan market is a necessary prerequisite of any risk management 
or regulatory action. 

Compared to the first edition, we have extended the book by three new chapters. 
In Chap. 6 estimation techniques for transition matrices are presented and their
properties are discussed. A transition matrix is a natural extension of a 1-year
default probability since it measures all transitions of a rating system, not only the
transitions to default. It is an important building block of the loan pricing framework
that is presented in Chaps. 17 and 18. In Chap. 17 it is shown how the Basel II
risk parameters can be used to build a risk-adjusted pricing framework for loans that
can be applied to compute a loan’s term based on RAROC (risk-adjusted return on
capital) as performance measure and to calculate general loss provisions for a loan
portfolio in an economically sensible way. Furthermore, this framework allows for
easy stress testing and for answering questions like: “What happens if the value of
collateral turns out to be 10% lower than assumed?” In Chap. 18, the pricing
framework is extended in a consistent way to loans with embedded options using 
option pricing theory. Often a loan contains prepayment rights, i.e. a debtor has the 
right to pay back parts or all of the notional at certain dates or throughout the loan’s 
lifetime without penalty. We demonstrate that the value of such an option is too 
large to be neglected and show further how to include embedded options into the 
RAROC framework of Chap. 17.

Finally, we would like to thank Martina Bihn from Springer-Verlag again for her 
support of this second edition and last but not least our families for their support 
when we again spent a lot of time working on it. 

Questions and comments on this book are welcome. The editors can be 
reached under their e-mail addresses bernd.engelmann@quantsolutions.de and 
robert.rauhmeier@arcor.de. 


Frankfurt am Main, Germany Bernd Engelmann 
Munich, Germany Robert Rauhmeier 
December 2010 


Preface to the First Edition 


In the last decade the banking industry has experienced a significant development in 
the understanding of credit risk. Refined methods were proposed concerning the 
estimation of key risk parameters like default probabilities. Further, a large volume 
of literature on the pricing and measurement of credit risk in a portfolio context has 
evolved. This development was partly reflected by supervisors when they agreed on 
the new revised capital adequacy framework, Basel II. Under Basel II, the level of 
regulatory capital depends on the risk characteristics of each credit while a portfolio 
context is still neglected. 

The focus of this book is on the estimation and validation of the three key 
Basel II risk parameters, probability of default (PD), loss given default (LGD), 
and exposure at default (EAD). Since the new regulatory framework will become 
operative in January 2007 (at least in Europe), many banks are in the final stages of 
implementation. Many questions have arisen during the implementation phase and 
are discussed by practitioners, supervisors, and academics. A “best practice” 
approach has to be formed and will be refined in the future even beyond 2007. 
With this book we aim to contribute to this process. Although the book is inspired 
by the new capital framework, we hope that it is valuable in a broader context. The 
three risk parameters are central inputs to credit portfolio models or credit pricing 
algorithms and their correct estimation is therefore essential for internal bank 
controlling and management. 

This is not a book about the Basel II framework. There is already a large volume 
of literature explaining the new regulation at length. Rather, we attend to the current 
state-of-the-art of quantitative and qualitative approaches. The book is a combination
of coordinated stand-alone articles, arranged into 15 chapters so that each
chapter can be read on its own. The authors are all experts from science, supervisory
authorities, and banking practice. The book is divided into three main parts:
Estimation techniques for the parameters PD, LGD and EAD, validation of these 
parameters, and stress testing. 

The first part begins with an overview of the popular and established methods for 
estimating PD. Chapter 2 focuses on methods for PD estimation for small and 
medium sized corporations while Chap. 3 treats the PD estimation for the retail
segment. Chapters 4 and 5 deal with those segments with only a few or even no
default data, as is often the case in the large corporate, financial institutions,
or sovereign segment. Chapter 4 illustrates how PD can be estimated with the
shadow rating approach while Chap. 5 uses techniques from probability theory.
Chapter 6 describes how PDs and recovery rates could be estimated under
consideration of systematic and idiosyncratic risk factors simultaneously. This is
a natural transition to Chaps. 7-10 on LGD and EAD estimation,
which is quite new in practice compared to ratings and PD estimation. Chapter 7
describes how LGD could be modelled in a point-in-time framework as a function 
of risk drivers, supported by an empirical study on bond data. Chapter 8 provides a 
general survey of LGD estimation from a practical point of view. Chapters 9 and 10 
are concerned with the modelling of EAD. Chapter 9 provides a general overview 
of EAD estimation techniques while Chap. 10 focuses on the estimation of EAD for
facilities with explicit limits. 

The second part of the book consists of four chapters about validation and 
statistical back-testing of rating systems. Chapter 11 deals with the perspective
of the supervisory authorities and gives a glance as to what is expected when rating
systems are used under the Basel II framework. Chapter 12 has a critical
discussion on measuring the discriminatory power of rating systems. Chapter 13
gives an overview of statistical tests for the dimension calibration, i.e. the accuracy
of PD estimations. In Chap. 14 these methods are enhanced by Monte Carlo
simulation techniques, which allow, e.g., for the integration of correlation assumptions, as is
also illustrated within a back-testing study on a real-life rating data sample.

The final part consists of Chap. 15, which is on stress testing. The purpose of
stress testing is to detect limitations of models for the risk parameters and to analyse
the effects of (extremely) adverse future scenarios on a bank’s portfolio. Concepts
and implementation strategies of stress tests are explained and a simulation study
reveals striking effects of stress scenarios when calculating economic capital with
a portfolio model.

All articles set great value on practical applicability and mostly include empirical
studies or work with examples. Therefore we regard this book as a valuable contribution
towards modern risk management in every financial institution, while we
steadily keep track of the requirements of Basel II. The book is addressed to risk
managers, rating analysts and in general quantitative analysts who work in the credit
risk area or on regulatory issues. Furthermore, we target internal auditors and supervisors
who have to evaluate the quality of rating systems and risk parameter estimations.
We hope that this book will deepen their understanding and will be useful for
their daily work. Last but not least we hope this book will also be of interest to
academics or students in finance or economics who want to get an overview of the
state-of-the-art of a currently important topic in the banking industry.

Finally, we have to thank all the people who made this book possible. Our 
sincere acknowledgements go to all the contributors of this book for their work, 
their enthusiasm, their reliability, and their cooperation. We know that most of the 
writing had to be done in valuable spare time. We are glad that all of them were 
willing to make such sacrifices for the sake of this book. Special thanks go to
Walter Gruber for giving us the idea to edit this book.

We are grateful to Martina Bihn from Springer-Verlag who welcomed our idea
for this book and supported our work on it.

We thank Dresdner Bank AG, especially Peter Gassmann and Dirk Thomas, and 
Quanteam AG for supporting our book. Moreover we are grateful to all our 
colleagues and friends who agreed to work as referees or discussion partners. 

Finally we would like to thank our families for their continued support and 
understanding. 


Frankfurt am Main, Germany Bernd Engelmann 
Munich, Germany Robert Rauhmeier 
June 2006 
Chapter 1
Statistical Methods to Develop Rating Models 


Evelyn Hayden and Daniel Porath 


1.1 Introduction 


The Internal Rating Based Approach (IRBA) of the New Basel Capital Accord 
allows banks to use their own rating models for the estimation of probabilities of 
default (PD) as long as the systems meet specified minimum requirements. Statistical 
theory offers a variety of methods for building and estimating rating models. This
chapter gives an overview of these methods. The overview is focused on statistical 
methods and includes parametric models like linear regression analysis, discriminant 
analysis, binary response analysis, time-discrete panel methods, hazard models and 
nonparametric models like neural networks and decision trees. We also highlight the 
benefits and the drawbacks of the various approaches. We conclude by interpreting 
the models in light of the minimum requirements of the IRBA. 


1.2 Statistical Methods for Risk Classification 


In the following we define statistical models as the class of approaches that use
econometric methods to classify borrowers according to their risk. Statistical rating 
systems primarily involve a search for explanatory variables which provide as 
sound and reliable a forecast of the deterioration of a borrower’s situation as 
possible. In contrast, structural models explain the threats to a borrower based on 
an economic model and thus use clear causal connections instead of the mere 
correlation of variables. 

The following sections offer an overview of parametric and nonparametric
models generally considered for statistical risk assessment. Furthermore, we discuss
the advantages and disadvantages of each approach. Many of the methods are
described in more detail in standard econometric textbooks, like Greene (2003). 

In general, a statistical model may be described as follows: As a starting point,
every statistical model uses the borrower’s characteristic indicators and (possibly)
macroeconomic variables which were collected historically and are available for
defaulting (or troubled) and non-defaulting borrowers. Let the borrower’s characteristics
be defined by a vector of n separate variables (also called covariates)
$x = (x_1, \ldots, x_n)$ observed at time $t - L$. The state of default is indicated by a binary
performance variable $y$ observed at time $t$. The variable $y$ is defined as $y = 1$ for a
default and $y = 0$ for a non-default.

The sample of borrowers now includes a number of individuals or firms that 
defaulted in the past, while (typically) the majority did not default. Depending on the 
statistical application of this data, a variety of methods can be used to predict the 
performance. A common feature of the methods is that they estimate the correlation 
between the borrowers’ characteristics and the state of default in the past and use this 
information to build a forecasting model. The forecasting model is designed to assess 
the creditworthiness of borrowers with unknown performance. This can be done by 
inputting the characteristics x into the model. The output of the model is the estimated 
performance. The time lag L between x and y determines the forecast horizon. 
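
The pairing of covariates observed at time t − L with the performance variable observed at time t can be sketched as follows. This is only an illustration: the record layout, the borrower identifiers and the helper `build_sample` are hypothetical and not taken from the chapter.

```python
# Hypothetical yearly records: (borrower_id, year, equity_ratio, defaulted_in_year)
records = [
    ("A", 2001, 0.30, 0), ("A", 2002, 0.25, 0), ("A", 2003, 0.10, 1),
    ("B", 2001, 0.40, 0), ("B", 2002, 0.42, 0), ("B", 2003, 0.41, 0),
]


def build_sample(records, lag):
    """Pair covariates observed in year t - lag with the default flag of year t."""
    flags = {(borrower, year): y for borrower, year, _, y in records}
    sample = []
    for borrower, year, x, _ in records:
        target = (borrower, year + lag)
        if target in flags:  # keep only rows whose outcome year is observed
            sample.append((borrower, year, x, flags[target]))
    return sample


# A lag of one year gives a one-year forecast horizon.
sample = build_sample(records, lag=1)
for row in sample:
    print(row)
```

Choosing a larger `lag` lengthens the forecast horizon but leaves fewer usable observations, because the most recent covariates have no observed outcome yet.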


1.3 Regression Analysis 


As a starting point we consider the classical regression model. The regression 
model establishes a linear relationship between the borrowers’ characteristics and 
the default variable: 


$$y_i = \beta' \cdot x_i + u_i \qquad (1.1)$$


Again, $y_i$ indicates whether borrower $i$ has defaulted ($y_i = 1$) or not ($y_i = 0$). In
period $t$, $x_i$ is a column vector of the borrowers’ characteristics observed in period
$t - L$ and $\beta$ is a column vector of parameters which capture the impact of a change
in the characteristics on the default variable. Finally, $u_i$ is the residual variable
which contains the variation not captured by the characteristics $x_i$.

The standard procedure is to estimate (1.1) with the ordinary least squares (OLS)
estimators of $\beta$ which in the following are denoted by $b$. The estimated result is the
borrower’s score $S_i$. This can be calculated by


$$S_i = E(y_i|x_i) = b' \cdot x_i. \qquad (1.2)$$


Equation (1.2) shows that a borrower’s score represents the expected value of the 
performance variable when his or her individual characteristics are known. 
The score can be calculated by inputting the values for the borrower’s characteristics
into the linear function given in (1.2).

Note that $S_i$ is continuous (while $y_i$ is a binary variable), hence the output of the
model will generally be different from 0 or 1. In addition, the prediction can take on
values larger than 1 or smaller than 0. As a consequence, the outcome of the model
cannot be interpreted as a probability level. However, the score $S_i$ can be used for
the purpose of comparison between different borrowers, where higher values of $S_i$
correlate with a higher default risk.
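
A minimal sketch of the score in (1.2) for a single covariate illustrates why the output cannot be read as a probability. The leverage ratios and default flags below are invented for illustration, and `ols_fit` is a hypothetical helper rather than a routine from the chapter.

```python
def ols_fit(x, y):
    """Ordinary least squares for one covariate plus an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented sample: leverage ratio and default flag (1 = default).
leverage = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
default = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = ols_fit(leverage, default)


def score(x):
    """Score S = b'x as in (1.2), here with an intercept."""
    return b0 + b1 * x


print(score(0.1))  # falls below 0
print(score(1.2))  # exceeds 1 for a highly leveraged new borrower
```

The ordering of the scores is still informative: a borrower with higher leverage receives a higher score, which is all that the comparison between borrowers requires.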

The benefits and drawbacks of models (1.1) and (1.2) are the following:


• OLS estimators are well-known and easily available.

• The forecasting model is a linear model and therefore easy to compute and to
understand.

• The random variable $u_i$ is heteroscedastic (i.e. the variance of $u_i$ is not constant
for all $i$) since


$$\mathrm{Var}(u_i) = \mathrm{Var}(y_i) = E(y_i|x_i) \cdot [1 - E(y_i|x_i)] = b' \cdot x_i \, (1 - b' \cdot x_i). \qquad (1.3)$$


As a consequence, the estimation of $\beta$ is inefficient and additionally, the
standard errors of the estimated coefficients $b$ are biased. An efficient way to
estimate $\beta$ is to apply the Weighted Least Squares (WLS) estimator.

• WLS estimation of $\beta$ is efficient, but the estimation of the standard errors of $b$
still remains biased. This happens due to the fact that the residuals are not
normally distributed as they can only take on the values $-b' x_i$ (if the borrower
does not default and $y$ therefore equals 0) or $(1 - b' x_i)$ (if the borrower does
default and $y$ therefore equals 1). This implies that there is no reliable way to
assess the significance of the coefficients $b$ and it remains unknown whether the
estimated values represent precise estimations of significant relationships or
whether they are just caused by spurious correlations. Inputting characteristics
which are not significant into the model can seriously harm the model’s stability
when used to predict borrowers’ risk for new data. A way to cope with this
problem is to split the sample into two parts, where one part (the training sample)
is used to estimate the model and the other part (the hold-out sample) is used to
validate the results. The consistency of the results of both samples is then taken
as an indicator for the stability of the model.

• The absolute value of $S_i$ cannot be interpreted.
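
The two-step WLS idea mentioned in the list above can be sketched as follows: fit OLS first, estimate each residual variance from (1.3), then refit with inverse-variance weights. The data are invented, and the clipping of scores to the interval [0.01, 0.99] is an assumption added here because (1.3) is not positive for scores outside (0, 1); the chapter does not prescribe how to handle that case.

```python
def wls_fit(x, y, w):
    """Weighted least squares for one covariate plus an intercept."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_variance(score, eps=0.01):
    """Var(u_i) = S_i (1 - S_i) as in (1.3); clipping to [eps, 1 - eps] is an assumption."""
    s = min(max(score, eps), 1.0 - eps)
    return s * (1.0 - s)


# Invented sample: leverage ratio and default flag (1 = default).
leverage = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
default = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

# Step 1: OLS is WLS with unit weights.
a_ols, b_ols = wls_fit(leverage, default, [1.0] * len(leverage))

# Step 2: weight each observation by the inverse of its estimated variance.
weights = [1.0 / residual_variance(a_ols + b_ols * x) for x in leverage]
a_wls, b_wls = wls_fit(leverage, default, weights)

print(b_ols, b_wls)
```

Observations whose OLS score lies close to 0 or 1 have a small estimated variance and therefore dominate the weighted fit, which is one practical reason why the result can be sensitive to the clipping threshold.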


1.4 Discriminant Analysis 


Discriminant analysis is a classification technique applied to corporate bankruptcies 
by Altman as early as 1968 (see Altman 1968). Linear discriminant analysis is 
based on the estimation of a linear discriminant function with the task of separating 
individual groups (in this case of defaulting and non-defaulting borrowers) accord- 
ing to specific characteristics. The discriminant function is 

$$S_i = \beta' \cdot x_i. \qquad (1.4)$$


The score $S_i$ is also called the discriminant variable. The estimation of the
discriminant function adheres to the following principle: 


Maximization of the spread between the groups (good and bad borrowers) and minimization
of the spread within individual groups


Maximization only determines the optimal proportions among the coefficients of
the vector $\beta$. Usually (but arbitrarily), coefficients are normalized by choosing the
pooled within-group variance to take the value 1. As a consequence, the absolute
level of $S_i$ is arbitrary as well and cannot be interpreted on a stand-alone basis. As in
linear regression analysis, $S_i$ can only be used to compare the prediction for
different borrowers (“higher score, higher risk”).

Discriminant analysis is similar to the linear regression model given in (1.1) and 
(1.2). In fact, the proportions among the coefficients of the regression model are
equal to the optimal proportion according to the discriminant analysis. The difference
between the two methods is a theoretical one: Whereas in the regression
model the characteristics are deterministic and the default state is the realization of 
a random variable, for discriminant analysis the opposite is true. Here the groups 
(default or non-default) are deterministic and the characteristics of the discriminant 
function are realizations from a random variable. For practical use this difference is 
virtually irrelevant. 
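
The statement that the regression coefficients and the discriminant coefficients share the same proportions can be checked numerically. The sketch below uses two invented covariates and computes the discriminant direction as the within-group scatter matrix inverted and applied to the difference of group means (Fisher's rule, which is an assumption about the exact estimator since the chapter does not spell it out); all helper names are hypothetical.

```python
def group_mean(rows):
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def scatter(rows, center):
    """2x2 matrix of summed outer products of deviations from center."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - center[0], r[1] - center[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def solve2(m, v):
    """Solve the 2x2 linear system m * z = v."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det)


# Invented sample: (leverage, liquidity) and default flag (1 = default).
X = [(0.1, 1.2), (0.3, 0.8), (0.2, 1.0), (0.4, 1.1),
     (0.6, 0.5), (0.7, 0.9), (0.8, 0.4), (0.5, 0.6)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

good = [x for x, flag in zip(X, y) if flag == 0]
bad = [x for x, flag in zip(X, y) if flag == 1]
m0, m1 = group_mean(good), group_mean(bad)

# Discriminant direction: within-group scatter inverse times mean difference.
s0, s1 = scatter(good, m0), scatter(bad, m1)
within = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
w = solve2(within, (m1[0] - m0[0], m1[1] - m0[1]))

# OLS slopes of the binary flag on both covariates (intercept absorbed by centering).
mx = group_mean(X)
my = sum(y) / len(y)
sxy = (sum((x[0] - mx[0]) * (flag - my) for x, flag in zip(X, y)),
       sum((x[1] - mx[1]) * (flag - my) for x, flag in zip(X, y)))
b = solve2(scatter(X, mx), sxy)

print(w[0] / w[1], b[0] / b[1])  # the two ratios coincide
```

Only the ratio between coefficients is compared, since the absolute scale of both score functions is arbitrary.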

Therefore, the benefits and drawbacks of discriminant analysis are similar to 
those of the regression model: 


• Discriminant analysis is a widely known method with estimation algorithms that
are easily available.

• Once the coefficients are estimated, the scores can be calculated in a straightforward
way with a linear function.

• Since the characteristics $x_i$ are assumed to be realizations of random variables,
the statistical tests for the significance of the model and the coefficients rely on
the assumption of multivariate normality. This is, however, unrealistic for the
variables typically used in rating models as for example financial ratios from the
balance-sheet. Hence, the methods for analyzing the stability of the model and
the plausibility of the coefficients are limited to a comparison between training
and hold-out sample.

• The absolute value of the discriminant function cannot be interpreted in levels.


1.5 Logit and Probit Models 


Logit and probit models are econometric techniques designed for analyzing binary 
dependent variables. There are two alternative theoretical foundations. 

The latent-variable approach assumes an unobservable (latent) variable y* which 
is related to the borrower’s characteristics in the following way: 

$$y_i^* = \beta' \cdot x_i + u_i \qquad (1.5)$$


Here $\beta$, $x_i$ and $u_i$ are defined as above. The variable $y_i^*$ is metrically scaled and
triggers the value of the binary default variable $y_i$:


$$y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (1.6)$$


This means that the default event sets in when the latent variable exceeds the 
threshold zero. Therefore, the probability for the occurrence of the default event 
equals: 


$$P(y_i = 1) = P(u_i > -\beta' \cdot x_i) = 1 - F(-\beta' \cdot x_i) = F(\beta' \cdot x_i). \qquad (1.7)$$


Here F(.) denotes the (unknown) distribution function. The last step in (1.7) 
assumes that the distribution function has a symmetric density around zero. The 
choice of the distribution function F(.) depends on the distributional assumptions
about the residuals ($u_i$). If a normal distribution is assumed, we are faced with the
probit model:


$$F(\beta' \cdot x_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\beta' \cdot x_i} e^{-t^2/2} \, dt \qquad (1.8)$$


If instead the residuals are assumed to follow a logistic distribution, the result is 
the logit model: 


$$F(\beta' \cdot x_i) = \frac{e^{\beta' \cdot x_i}}{1 + e^{\beta' \cdot x_i}} \qquad (1.9)$$
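
The two distribution functions in (1.8) and (1.9) can be evaluated with the Python standard library. The sketch below is illustrative only; the function names are hypothetical, and the argument `z` stands for the linear index β'x.

```python
import math


def probit_link(z):
    """Standard normal distribution function, as in (1.8)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_link(z):
    """Logistic distribution function, as in (1.9), written as 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(z, probit_link(z), logit_link(z))
```

Both functions map any real index to a value between 0 and 1 and are symmetric around zero, which is the property used in the last step of (1.7). For the same index value, the logistic function places more mass in the tails.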

The second way to motivate logit and probit models starts from the aim of
estimating default probabilities. For single borrowers, default probabilities cannot
be observed directly; only the default event itself is observed. However, for groups of
borrowers the observed default frequencies can be interpreted as default probabilities.
As a starting point consider the OLS estimation of the following regression:


$$p_i = b' \cdot x_i + u_i \qquad (1.10)$$


In (1.10) the index $i$ denotes the group formed by a number of individuals, $p_i$ is
the default frequency observed in group $i$ and $x_i$ are the characteristics observed for
group $i$. The model, however, is inadequate. To see this consider that the outcome
(which is $E(y_i|x_i) = b' x_i$) is not bounded to values between zero and one and
therefore cannot be interpreted as a probability. As it is generally implausible to
assume that a probability can be calculated by a linear function, in a second step the
linear expression $b' x_i$ is transformed by a nonlinear function (link function) $F$:


$$p_i = F(b' \cdot x_i). \qquad (1.11)$$

An appropriate link function transforms the values of $b' x_i$ to a scale within the
interval [0,1]. This can be achieved by any distribution function. The choice of the link
function determines the type of model: with a logistic link function (1.11) becomes a
logit model, while with the normal distribution (1.11) results in the probit model.

However, when estimating (1.10) with OLS, the residuals will be heteroscedastic,
because $\mathrm{Var}(u_i) = \mathrm{Var}(p_i) = p(x_i) \cdot (1 - p(x_i))$. A possible way to achieve
homoscedasticity would be to compute the WLS estimators of $b$ in (1.10). However, albeit
possible, this is not common practice. The reason is that in order to observe default
frequencies, the data has to be grouped before estimation. Grouping involves considerable
practical problems like defining the size and number of the groups and the
treatment of different covariates within the single groups. A better way to estimate
logit and probit models, which does not require grouping, is the Maximum-Likelihood
(ML) method. For a binary dependent variable the likelihood function looks like:


$$L(b) = \prod_i P(b' \cdot x_i)^{y_i} \, [1 - P(b' \cdot x_i)]^{1 - y_i}. \qquad (1.12)$$


For the probit model P(.) is the normal distribution function and for the logit model
P(.) is the logistic distribution function. With (1.12) the estimation of the model is
theoretically convincing and also easy to handle. Furthermore, the ML-approach
lends itself to a broad set of tests to evaluate the model and its single variables (see
Hosmer and Lemeshow (2000) for a comprehensive introduction).
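
As an illustration of ML estimation on ungrouped data, the sketch below maximizes the logarithm of (1.12) for a logit model with an intercept and one covariate. The Newton-Raphson iteration is one common choice of optimizer, not necessarily the one used by any particular statistical package; the data and the name `fit_logit` are invented for this example.

```python
import math


def fit_logit(x, y, iterations=25):
    """Maximize the log of (1.12) for a logit model by Newton-Raphson steps."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p              # gradient with respect to the intercept
            g1 += (yi - p) * xi       # gradient with respect to the slope
            w = p * (1.0 - p)         # entries of the negative Hessian
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Invented sample: leverage ratio and default flag (1 = default).
leverage = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
default = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logit(leverage, default)
pds = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in leverage]
print(b0, b1)
print(pds)
```

Unlike the OLS score, every fitted value lies strictly between 0 and 1. Note that the sample must not be perfectly separable by the covariate, otherwise the ML estimate does not exist and the iteration diverges.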

Usually, the choice of the link function is not theoretically driven. Users familiar 
with the normal distribution will opt for the probit model. Indeed, the differences in 
the results of both classes of models are often negligible. This is due to the fact that 
both distribution functions have a similar form except for the tails, which are 
heavier for the logit model. The logit model is easier to handle, though. First of 
all, the computation of the estimators is easier. However, today computational 
complexity is often irrelevant as most users apply statistical software where the 
estimation algorithms are integrated. What is more important is the fact that the 
coefficients of the logit model can be more easily interpreted. To see this we 
transform the logit model given in (1.9) in the following way: 


$$\frac{p_i}{1 - p_i} = e^{\beta' \cdot x_i} \qquad (1.13)$$


The left-hand side of (1.13) is the odds, i.e. the relation between the default
probability and the probability of survival. Now it can be easily seen that a variation
of a single variable $x_k$ by one unit has an impact of $e^{\beta_k}$ on the odds, when $\beta_k$ denotes
the coefficient of the variable $x_k$. Hence, the transformed coefficients $e^{\beta_k}$ are called
odds-ratios. They represent the multiplicative impact of a borrower’s characteristic
on the odds. Therefore, for the logit model, the coefficients can be interpreted in a
plausible way, which is not possible for the probit model. Indeed, the most important
weakness of binary models is the fact that the interpretation of the coefficients is not
straightforward.
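
The odds-ratio interpretation following from (1.13) can be verified with a short numerical check. The coefficients and covariate values below are purely hypothetical.

```python
import math


def logit_pd(intercept, coefs, covariates):
    """Default probability of a logit model for one borrower."""
    z = intercept + sum(c * v for c, v in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Default probability divided by survival probability."""
    return p / (1.0 - p)


intercept, coefs = -3.0, [0.8, -1.5]   # hypothetical coefficients
x_base = [1.0, 0.4]
x_shift = [2.0, 0.4]                   # first covariate raised by one unit

ratio = odds(logit_pd(intercept, coefs, x_shift)) / odds(logit_pd(intercept, coefs, x_base))
print(ratio, math.exp(coefs[0]))       # both equal e^0.8
```

The ratio does not depend on the starting values of the covariates, which is what makes the transformed coefficient interpretable on its own.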

The strengths of logit and probit models can be summarized as:


• The methods are theoretically sound.

• The results generated can be interpreted directly as default probabilities.

• The significance of the model and the individual coefficients can be tested.
Therefore, the stability of the model can be assessed more effectively than in
the previous cases.


1.6 Panel Models 


The methods discussed so far are all cross-sectional methods because all covariates 
are related to the same period. However, banks typically have at their disposal a set of
covariates for more than one period for each borrower. In this case it is possible
to expand the cross-sectional input data to a panel dataset. The main motivation is to 
enlarge the number of available observations for the estimation and therefore 
enhance the stability and the precision of the rating model. Additionally, panel 
models can integrate macroeconomic variables into the model. Macroeconomic 
variables can improve the model for several reasons. First, many macroeconomic 
data sources are more up-to-date than the borrowers’ characteristics. For example, 
financial ratios calculated from balance sheet information are usually updated only 
once a year and are often up to 2 years old when used for risk assessment. The oil
price, by contrast, is available at a daily frequency. Secondly, by stressing the macroeconomic
input factors, the model can be used for a form of stress-testing credit
risk. However, as macroeconomic variables primarily affect the absolute value of
the default probability, it is only reasonable to incorporate macroeconomic input 
factors into those classes of models that estimate default probabilities. 
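
Stressing a macroeconomic input of a probability model can be sketched as follows. The logit specification, the coefficients, and the choice of GDP growth as the macroeconomic covariate are all assumptions made for illustration; none of them is taken from the chapter.

```python
import math


def logit_pd(intercept, coefs, covariates):
    """Default probability of a logit model for one borrower."""
    z = intercept + sum(c * v for c, v in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-z))


INTERCEPT = -4.0
COEFS = [2.5, -0.3]   # hypothetical: [leverage ratio, GDP growth in percent]


def portfolio_pd(leverages, gdp_growth):
    """Average PD of the portfolio under a given macroeconomic scenario."""
    pds = [logit_pd(INTERCEPT, COEFS, [lev, gdp_growth]) for lev in leverages]
    return sum(pds) / len(pds)


leverages = [0.2, 0.5, 0.8]                             # invented borrower data
base = portfolio_pd(leverages, gdp_growth=2.0)          # baseline scenario
stressed = portfolio_pd(leverages, gdp_growth=-3.0)     # recession scenario
print(base, stressed)
```

Because the macroeconomic variable takes the same value for every borrower in a given period, stressing it shifts the level of all default probabilities while leaving the ranking of borrowers unchanged.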

In principle, the structure of, for example, a panel logit or probit model remains 
the same as given in the equations of the previous section. The only difference is 
that now the covariates are taken from a panel of data and have to be indexed by an 
additional time series indicator, i.e. we observe $x_{it}$ instead of $x_i$. At first glance panel
models seem similar to cross-sectional models. In fact, many developers ignore the 
dynamic pattern of the covariates and simply fit logit or probit models. However, 
logit and probit models rely on the assumption of independent observations. 
Generally, cross-sectional data meets this requirement, but panel data does not. 
The reason is that observations from the same period and observations from the 
same borrower are likely to be correlated. Introducing this correlation in the estimation 
procedure is cumbersome. For example, the fixed-effects estimator known from 
panel analysis for continuous dependent variables is not available for the probit 
model. Besides, the modified fixed-effects estimator for logit models proposed by 
Chamberlain (1980) excludes all non-defaulting borrowers from the analysis and 
therefore seems inappropriate. Finally, the random-effects estimators proposed in the 
literature are computationally intensive and can only be computed with specialized 
software. For an econometric discussion of binary panel analysis, refer to Hosmer 
and Lemeshow (2000). 


1.7 Hazard Models 


All methods discussed so far try to assess the riskiness of borrowers by estimating a 
certain type of score that indicates whether or not a borrower is likely to default 
within the specified forecast horizon. However, no prediction about the exact 
default point in time is made. Besides, these approaches do not allow the evaluation 
of the borrowers’ risk for future time periods, given that they do not default within 
the reference time horizon. 

These disadvantages can be remedied by means of hazard models, which 
explicitly take the survival function and thus the time at which a borrower’s default 
occurs into account. Within this class of models, the Cox proportional hazard model 
(cf. Cox 1972) is the most general regression model, as it is not based on any 
assumptions concerning the nature or shape of the underlying survival distribution. 
The model assumes that the underlying hazard rate (rather than survival time) is a 
function of the independent variables; no assumptions are made about the nature or 
shape of the hazard function. Thus, Cox’s regression model is a semiparametric 
model. The model can be written as: 


    h_i(t \mid x_i) = h_0(t) \cdot e^{\beta' x_i},    (1.14) 


where h_i(t | x_i) denotes the resultant hazard, given the covariates for the respective 
borrower and the respective survival time t. The term h_0(t) is called the baseline 
hazard; it is the hazard when all independent variable values are equal to zero. If the 
covariates are measured as deviations from their respective means, h_0(t) can be 
interpreted as the hazard rate of the average borrower. 

While no assumptions are made about the underlying hazard function, the model 
equation shown above implies important assumptions. First, it specifies a multipli- 
cative relationship between the hazard function and the log-linear function of the 
explanatory variables, which implies that the ratio of the hazards of two borrowers 
does not depend on time, i.e. the relative riskiness of the borrowers is constant, 
hence the name Cox proportional hazard model. 
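
The proportionality property can be made explicit by writing out the hazard ratio of two borrowers under (1.14). The second borrower index j is introduced here purely for illustration:

```latex
\frac{h_i(t \mid x_i)}{h_j(t \mid x_j)}
  = \frac{h_0(t)\, e^{\beta' x_i}}{h_0(t)\, e^{\beta' x_j}}
  = e^{\beta' (x_i - x_j)}
```

The baseline hazard cancels, so the ratio depends only on the difference in covariates and not on t.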

Besides, the model assumes that the default point in time is a continuous random 
variable. However, often the borrowers’ financial conditions are not observed 
continuously but rather at discrete points in time. What’s more, the covariates are 
treated as if they were constant over time, while typical explanatory variables like 
financial ratios change with time. 

Although there are some advanced models to incorporate the above mentioned 
features, the estimation of these models becomes complex. The strengths and 
weaknesses of hazard models can be summarized as follows: 


• Hazard models allow for the estimation of a survival function for all borrowers 
from the time structure of historical defaults, which implies that default prob- 
abilities can be calculated for different time horizons. 

• Estimating these models under realistic assumptions is not straightforward. 
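
To illustrate how a survival function yields default probabilities for several horizons, the following minimal sketch works in discrete time. It assumes that per-period conditional default probabilities (hazards) are already available; the function names and the numbers are hypothetical, and estimating the hazards themselves (e.g. with a Cox model) is not shown.

```python
def survival_curve(hazards):
    """Return S(1), S(2), ... from per-period conditional default probabilities."""
    survival, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h          # survive this period, given survival so far
        survival.append(s)
    return survival


def cumulative_pd(hazards):
    """Cumulative default probability up to each horizon: 1 - S(t)."""
    return [1.0 - s for s in survival_curve(hazards)]


# Hypothetical hazards for years 1, 2 and 3 of one borrower
hazards = [0.02, 0.03, 0.04]
pds = cumulative_pd(hazards)
```

Here `pds[0]` is the one-year default probability and `pds[2]` the three-year cumulative default probability of the same borrower.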




1.8 Neural Networks 


In recent years, neural networks have been discussed extensively as an alternative 
to the (parametric) models discussed above. They offer a more flexible design to 
represent the connections between independent and dependent variables. Neural 
networks belong to the class of non-parametric methods. Unlike the methods 
discussed so far they do not estimate parameters of a well-specified model. Instead, 
they are inspired by the way biological nervous systems, such as the brain, process 
information. They typically consist of many nodes that send a certain output if they 
receive a specific input from the other nodes to which they are connected. Like 
parametric models, neural networks are trained by a training sample to classify 
borrowers correctly. The final network is found by adjusting the connections 
between the input, output and any potential intermediary nodes. 
The strengths and weaknesses of neural networks can be summarized as: 


• Neural networks easily model highly complex, nonlinear relationships between 
the input and the output variables. 

• They are free from any distributional assumptions. 

• These models can be quickly adapted to new information (depending on the 
training algorithm). 

• There is no formal procedure to determine the optimum network topology for a 
specific problem, i.e. the number of the layers of nodes connecting the input with 
the output variables. 

• Neural networks are black boxes, hence they are difficult to interpret. 

• Calculating default probabilities is possible only to a limited extent and with 
considerable extra effort. 


In summary, neural networks are particularly suitable when there are no expec- 
tations (based on experience or theoretical arguments) on the relationship between 
the input factors and the default event and the economic interpretation of the 
resulting models is of inferior importance. 


1.9 Decision Trees 


A further category of non-parametric methods comprises decision trees, also called 
classification trees. Trees are models which consist of a set of if-then split conditions 
for classifying cases into two (or more) different groups. Under these methods, 
the base sample is subdivided into groups according to the covariates. In the 
case of binary classification trees, for example, each tree node is assigned a 
(usually univariate) decision rule, which subdivides the sample at that node 
into two subgroups. New observations are processed down the 
tree in accordance with the decision rules until an end node is reached, 
which then represents the classification of this observation. An example is given in 
Fig. 1.1. 


[Figure not reproduced. Recoverable node labels: Sector (Construction / Other); Years in business (Less than 2); EBIT; Equity ratio (Less than 15% / More than 15%); end nodes Risk class 2 and Risk class 3.] 

Fig. 1.1 Decision tree 


One of the most striking differences from the parametric models is that all covariates 
are grouped and treated as categorical variables. Furthermore, whether a 
specific variable or category becomes relevant depends on the categories of the 
variables in the upper level. For example, in Fig. 1.1 the variable “years in business” 
is only relevant for companies which operate in the construction sector. This kind of 
dependence between variables is called interaction. 
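
A tree of this kind is just a set of nested if-then rules. The sketch below loosely follows the node labels of Fig. 1.1, but the branch layout, the thresholds and the class labels "A" to "C" are hypothetical and do not reproduce the figure.

```python
def classify(borrower):
    """Assign a risk class by walking down nested if-then splits.

    All thresholds and class labels are illustrative assumptions.
    """
    if borrower["sector"] == "construction":
        # "years in business" is only consulted on the construction branch:
        # this is the interaction described in the text.
        if borrower["years_in_business"] < 2:
            return "C"
        return "A" if borrower["equity_ratio"] > 0.15 else "B"
    # All other sectors are split on EBIT instead.
    return "A" if borrower["ebit"] > 0 else "C"


young_builder = {"sector": "construction", "years_in_business": 1,
                 "equity_ratio": 0.30, "ebit": 1.0}
young_retailer = {"sector": "retail", "years_in_business": 1,
                  "equity_ratio": 0.30, "ebit": 1.0}
```

The two example borrowers differ only in their sector, yet the short business history matters for the construction firm alone.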

The most important algorithms for building decision trees are the Classification 
and Regression Trees algorithms (C&RT) popularized by Breiman et al. (1984) and 
the CHAID algorithm (Chi-square Automatic Interaction Detector, see Kass 1978). 
Both algorithms use different criteria to identify the best splits in the data and to 
collapse the categories which are not significantly different in outcome. 

The general strengths and weaknesses of trees are: 


• Through categorization, nonlinear relationships between the variables and the 
score can be easily modelled. 

• Interactions present in the data can be identified. Parametric methods can model 
interactions only to a limited extent (by introducing dummy variables). 

• As with neural networks, decision trees are free from distributional assumptions. 

• The output is easy to understand. 

• Probabilities of default have to be calculated in a separate step. 

• The output is (a few) risk categories and not a continuous score variable. 
Consequently, decision trees only calculate default probabilities for the final 
node in a tree, but not for individual borrowers. 

• Compared to other models, trees contain fewer variables and categories. The 
reason is that in each node the sample is successively partitioned and therefore 
continuously diminishes. 

• The stability of the model cannot be assessed with statistical procedures. The 
strategy is to work with a training sample and a hold-out sample. 


In summary, trees are particularly suited when the data is characterized by a 
limited number of predictive variables which are known to be interactive. 


1.10 Statistical Models and Basel II 


Finally, we ask the question whether the models discussed in this chapter are in line 
with the IRB Approach of Basel II. Prior to the discussion, it should be mentioned 
that in the Basel documents, rating systems are defined in a broader sense than in 
this chapter. Following § 394 of the Revised Framework from June 2004 (cf. BIS 
2004) a rating system “comprises all the methods, processes, controls, and data 
collection and IT systems that support the assessment of credit risk, the assignment 
of internal ratings, and the quantification of default and loss estimates”. Compared 
to this definition, the methods discussed in this chapter provide only one component, 
namely the assignment of internal ratings. 

The minimum requirements for internal rating systems are treated in Part II, 
Section III, H of the Revised Framework. A few passages of the text concern the 
assignment of internal ratings, and the requirements are general. They mainly 
concern the rating structure and the input data, examples being: 


• A minimum of seven rating classes of non-defaulted borrowers (§ 404) 

• No undue or excessive concentrations in single rating classes (§§ 403, 406) 

• A meaningful differentiation of risk between the classes (§ 410) 

• Plausible, intuitive and current input data (§§ 410, 411) 

• All relevant information must be taken into account (§ 411) 


The requirements do not reveal any preference for a certain method. It is indeed 
one of the central ideas of the IRBA that the banks are free in the choice of the 
method. Therefore the models discussed here are all possible candidates for the IRB 
Approach. 

The strengths and weaknesses of the single methods concern some of the 
minimum requirements. For example, hazard rate or logit panel models are especially 
suited for stress testing (as required by §§ 434, 435) since they contain a 
time-series dimension. Methods which allow for the statistical testing of the individual 
input factors (e.g. the logit model) provide a straightforward way to demonstrate the 
plausibility of the input factors (as required by § 410). When the outcome of the 
model is a continuous variable, the rating classes can be defined in a more flexible 
way (§§ 403, 404, 406). 

On the other hand, none of the drawbacks of the models considered here excludes 
a specific method. For example, a bank may have a preference for linear regression 
analysis. In this case the plausibility of the input factors cannot be verified by 
statistical tests and as a consequence the bank will have to search for alternative 
ways to meet the requirements of § 410. 

In summary, the minimum requirements are not intended as a guideline for the 
choice of a specific model. Banks should rather base their choice on their internal 
aims and restrictions. If necessary, those components that are only needed to 
satisfy the criteria of the IRBA should be added in a second step. All 
models discussed in this chapter allow for this. 


Chapter 2 
Estimation of a Rating Model for Corporate 
Exposures 


Evelyn Hayden 


2.1 Introduction 


This chapter focuses on the particular difficulties encountered when developing 
internal rating models for corporate exposures. The main characteristic of these 
internal rating models is that they mainly rely on financial ratios. Hence, the aim is 
to demonstrate how financial ratios can be used for statistical risk assessment. The 
chapter is organised as follows: Sect. 2.2 describes some of the issues concerning 
model selection, while Sect. 2.3 presents data from Austrian companies that will 
illustrate the theoretical concepts. Section 2.4 discusses data processing, which 
includes the calculation of financial ratios, their transformation to establish linearity, 
the identification of outliers and the handling of missing values. Section 2.5 describes 
the actual estimation of the rating model, i.e. univariate and multivariate analyses, 
multicollinearity issues and performance measurement. Finally, Sect. 2.6 concludes. 


2.2 Model Selection 


Chapter 1 presents several statistical methods for building and estimating rating 
models. The most popular of these model types — in the academic literature as well 
as in practice — is the logit model, mainly for two reasons. Firstly, the output from 
the logit model can be directly interpreted as default probability, and secondly, the 
model allows an easy check as to whether the empirical dependence between the 
potential explanatory variables and default risk is economically meaningful 
(see Sect. 2.4). Hence, a logit model is chosen to demonstrate the estimation of 
internal rating models for corporate exposures. 

Next, the default event must be defined. Historically, rating models were 
developed using mostly the default criterion bankruptcy, as this information was 
relatively easily observable. However, banks also incur losses before the event of 
bankruptcy, for example, when they allow debtors to defer payments without 
compensation in hopes that later on, the troubled borrowers will be able to repay 
their debt. Therefore, the Basel Committee on Banking Supervision (2001) defined 
a reference definition of default that includes all those situations where a bank 
loses money and declared that banks would have to use this regulatory reference 
definition of default for estimating internal rating-based models. However, as 
demonstrated in Hayden (2003), rating models developed by exclusively relying 
on bankruptcy as the default criterion can be just as powerful in predicting the 
more comprehensive credit loss events defined in the new Basel capital accord as 
models estimated on these default criteria. In any case, when developing rating models one 
has to guarantee that the default event used to estimate the model is comparable to 
the event the model is meant to predict. 

Finally, a forecast horizon must be chosen. As illustrated by the Basel Committee 
on Banking Supervision (1999), even before Basel II it was common practice for 
most banks to use a modelling horizon of one year. On the one hand, this time 
horizon is long enough to allow banks to take some action to avert predicted 
defaults; on the other hand, the time lag is short enough to guarantee the 
timeliness of the data input into the rating model. 


2.3 The Data Set 


The theoretical concepts discussed in this chapter will be illustrated by application 
to a data set of Austrian companies, which represents a small sample of the credit 
portfolio of an Austrian bank. The original data, which was supplied by a major 
commercial Austrian bank for the research project described in Hayden (2002), 
consisted of about 5,000 firm-year observations of balance sheets and profit and loss 
accounts from 1,500 individual companies spanning 1994 to 1999. However, due to 
obvious mistakes in the data, such as assets being different from liabilities or 
negative sales, the data set had to be reduced to about 4,500 observations. Besides, 
certain firm types were excluded, i.e. all public firms including large international 
corporations that do not represent the typical Austrian company and rather small 
single owner firms with a turnover of less than 5 m ATS (about 0.36 m EUR), whose 
credit quality often depends as much on the finances of a key individual as on the 
firm itself. After eliminating financial statements covering a period of less than 
twelve months and checking for observations that were included twice or more in 
the data set, almost 3,900 firm-years were left. Finally, observations were dropped 
where the default information (bankruptcy) was missing or dubious. 

Table 2.1 shows the total number of observed companies per year and splits the 
sample into defaulting and non-defaulting firms. However, the data for 1994 is not 
depicted, as we are going to calculate dynamic financial ratios (which compare 
current to past levels of certain balance sheet items) later on, and these ratios cannot 
be calculated for 1994 as the first period in the sample. 


Table 2.1 Number of observed companies per year 

Year     Non-defaulting firms   Defaulting firms   Total 
1995     1,185                  54                 1,239 
1996     616                    68                 684 
1997     261                    46                 307 
1998     27                     2                  29 
1999     23                     1                  24 
Total    2,112                  171                2,283 


2.4 Data Processing 


Section 2.4 discusses the major preparatory operations necessary before the model 
estimation can be conducted. They include the cleaning of the data, the calculation 
of financial ratios, and their transformation to establish linearity. 


2.4.1 Data Cleaning 


Some of the important issues with respect to data cleaning were mentioned in 
Sect. 2.3 when the Austrian data set was presented. As described, it was guaranteed 
that: 


• The sample data was free of (obvious) mistakes 

• The data set comprised only homogeneous observations, where the relationship 
between the financial ratios and the default event could be expected to be 
comparable 

• The default information was available (and reliable) for all borrowers 


In addition, missing information with respect to the financial input data must be 
properly managed. Typically, at least for some borrowers, part of the financial 
information is missing. If the number of the observations concerned is rather low, 
the easiest way to handle the problem is to eliminate the respective observations 
completely from the data set (as implemented for the Austrian data). If, however, 
this would result in too many observations being lost, it is preferable to exclude all 
variables with high numbers of missing values from the analysis. Once the model 
has been developed and is in use, the missing information needed to calculate the 
model output can be handled by substituting the missing financial ratios with the 
corresponding mean or median values over all observations for the respective time 
period (i.e. practically “neutral” values) in order to create as undistorted an assess- 
ment as possible using the remaining input factors. 
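
The substitution of missing ratios by a "neutral" value can be sketched as follows. This is a minimal illustration using the median per period; the record layout (`firm`, `year`, `equity_ratio`) and the numbers are hypothetical, and the function assumes that each period contains at least one observed value.

```python
from statistics import median


def impute_with_period_median(rows, ratio):
    """Replace missing (None) values of `ratio` by the median of the same period."""
    by_period = {}
    for row in rows:
        if row[ratio] is not None:
            by_period.setdefault(row["year"], []).append(row[ratio])
    medians = {year: median(vals) for year, vals in by_period.items()}
    # Build new records so that the input data is left unchanged.
    return [
        {**row, ratio: medians[row["year"]] if row[ratio] is None else row[ratio]}
        for row in rows
    ]


rows = [
    {"firm": "A", "year": 1996, "equity_ratio": 0.10},
    {"firm": "B", "year": 1996, "equity_ratio": None},
    {"firm": "C", "year": 1996, "equity_ratio": 0.30},
    {"firm": "D", "year": 1997, "equity_ratio": 0.05},
]
filled = impute_with_period_median(rows, "equity_ratio")
```

Firm B receives the 1996 median of the observed values, while all other records are passed through unchanged.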


2.4.2 Calculation of Financial Ratios 


Once the quality of the basic financial data is guaranteed, potential explanatory 
variables have to be selected. Typically, ratios are formed to standardise the 
available information. For example, the ratio “Earnings per Total Assets” enables 
a comparison of the profitability of firms of different size. In addition to considering 
ratios that reflect different financial aspects of the borrowers, dynamic ratios that 
compare current to past levels of certain balance sheet items can be very useful for 
predicting default events. Overall, the selected input ratios should represent the 
most important credit risk factors, i.e. leverage, liquidity, productivity, turnover, 
activity, profitability, firm size, growth rates and leverage development. 

After the calculation of the financial input ratios, it is necessary to identify and 
eliminate potential outliers, because they can and do severely distort the estimated 
model parameters. Outliers in the ratios might exist even if the underlying financial 
data is absolutely clean, for example, when the denominator of a ratio is allowed to 
take on values close to zero. To avoid the need to eliminate the affected observations, 
a typical procedure is to replace the extreme data points by the 1% or 
the 99% percentile, respectively, of the ratio concerned. 
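
This replacement of extreme values (often called winsorizing) can be sketched as below. The percentile definition used here, linear interpolation between order statistics, is one of several common conventions and is an assumption; the helper names and the sample data are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def winsorize(values, lower=0.01, upper=0.99):
    """Clip values below the lower / above the upper percentile to these cut-offs."""
    s = sorted(values)
    lo_cut, hi_cut = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]


# Hypothetical ratio values 0, 1, ..., 99 plus one extreme outlier
ratio_values = [float(i) for i in range(100)] + [10_000.0]
clean = winsorize(ratio_values)
```

No observation is dropped: the outlier is pulled back to the 99% percentile, and the ordering of the remaining observations is preserved.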

Table 2.2 portrays the explanatory variables selected for use for the Austrian 
data and presents some descriptive statistics. The indicators chosen comprise a 
small set of typical business ratios. A broader overview of potential input ratios 
as well as a detailed discussion can be found in Hayden (2002). 

The last column in Table 2.2 depicts the expected dependence between the 
accounting ratio and the default probability, where + symbolises that an increase 
in the ratio leads to an increase in the default probability and — symbolises a 
decrease in the default probability given an increase in the explanatory variable. 


Table 2.2 Selected input ratios 

No.  Financial ratio                               Risk factor       Mean    Stand. Dev.  Min.    Max.     Hypo. 
1    Total Liabilities/Total Assets                Leverage          0.89    0.18         0.02    1.00     + 
2    Equity/Total Assets                           Leverage          -0.04   0.34         -0.92   0.98     - 
3    Bank Debt/T. Assets                           Leverage          0.39    0.26         0.00    0.97     + 
4    Short Term Debt/Total Assets                  Liquidity         0.73    0.25         0.02    1.00     + 
5    Current Assets/Current Liabilities            Liquidity         0.08    0.15         0.00    0.72     - 
6    Accounts Receivable/Net Sales                 Activity          0.13    0.12         0.00    0.41     + 
7    Accounts Payable/Net Sales                    Activity          0.12    0.12         0.00    0.44     + 
8    (Net Sales - Material Costs)/Person. Costs    Productivity      2.56    1.85         1.03    8.55     - 
9    Net Sales/Total Assets                        Turnover          1.71    1.08         0.01    4.43     - 
10   EBIT/Total Assets                             Profitability     0.06    0.13         -0.18   0.39     - 
11   Ordinary Business Income/Total Assets         Profitability     0.02    0.13         -0.19   0.33     - 
12   Total Assets (in 1 Mio. EUR)                  Size              35.30   72.98        0.22    453.80   - 
13   Net Sales/Net Sales last year                 Growth            1.06    0.34         0.02    2.03     -/+ 
14   Total Liabilities/Liabilities last year       Leverage Growth   1.00    1.03         0.07    1.23     + 


Whenever a certain ratio is selected as a potential input variable for a rating model, 
it should be assured that a clear hypothesis can be formulated about this dependence 
to guarantee that the resulting model is economically plausible. Note, however, that 
the hypothesis chosen can also be rather complex; for example, for the indicator 
sales growth, the hypothesis formulated is “—/+”. This takes into account that the 
relationship between the rate at which companies grow and the rate at which they 
default is not as simple as that between other ratios and default. While it is generally 
better for a firm to grow than to shrink, companies that grow very quickly often find 
themselves unable to meet the management challenges presented by such growth — 
especially within smaller firms. Furthermore, this quick growth is unlikely to be 
financed out of profits, resulting in a possible build up of debt and the associated 
risks. Therefore, one should expect that the relationship between sales growth and 
default is non-monotone, which will be examined in detail in the next section. 


2.4.3 Test of Linearity Assumption 


After having selected the candidate input ratios, the next step is to check whether 
the underlying assumptions of the logit model apply to the data. As explained in 
Chap. 1, the logit model can be written as 


    P_i = P(y_i = 1) = F(\beta' x_i) = \frac{e^{\beta' x_i}}{1 + e^{\beta' x_i}},    (2.1) 


which implies a linear relationship between the log odd and the input variables: 


    \text{Log odd} = \ln\left(\frac{P_i}{1 - P_i}\right) = \beta' x_i.    (2.2) 

This linearity assumption can be easily tested by dividing the indicators into 
groups that all contain the same number of observations, calculating the historical 
default rate and the corresponding empirical log odd within each group, and estimating a 
linear regression of the log odds on the mean values of the ratio intervals. 
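
The grouping-and-regression test just described can be sketched as follows. This is a minimal pure-Python illustration, not the procedure actually used for the Austrian data: the function names are hypothetical, groups with a default rate of 0 or 1 are skipped because their log odd is undefined, and the last group absorbs any remainder if the sample size is not divisible by the number of groups. The toy data is constructed so that the group default rates are 50%, 40%, 30% and 20%.

```python
import math


def empirical_log_odds(ratio_values, defaults, n_groups):
    """Sort by ratio, split into (near) equal-sized groups and return one
    (mean ratio, empirical log odd) point per group."""
    pairs = sorted(zip(ratio_values, defaults))
    size = len(pairs) // n_groups
    points = []
    for g in range(n_groups):
        chunk = pairs[g * size:(g + 1) * size] if g < n_groups - 1 else pairs[g * size:]
        rate = sum(d for _, d in chunk) / len(chunk)
        if 0.0 < rate < 1.0:  # log odd undefined otherwise; such groups are skipped
            mean_ratio = sum(x for x, _ in chunk) / len(chunk)
            points.append((mean_ratio, math.log(rate / (1.0 - rate))))
    return points


def ols_fit(points):
    """Simple linear regression y = a + b*x; returns (a, b, r_squared)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    b = sxy / sxx
    a = my - b * mx
    r2 = sxy ** 2 / (sxx * syy) if syy > 0 else 1.0
    return a, b, r2


# Toy data: four ratio levels with 10 firms each and 5, 4, 3, 2 defaults
ratio, defaults = [], []
for level, n_defaults in enumerate([5, 4, 3, 2]):
    ratio += [float(level)] * 10
    defaults += [1] * n_defaults + [0] * (10 - n_defaults)

points = empirical_log_odds(ratio, defaults, n_groups=4)
intercept, slope, r_squared = ols_fit(points)
```

A high `r_squared` together with a slope of the expected sign supports the linearity assumption for the ratio in question; a low value suggests that the ratio should be transformed first.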

When applied to the Austrian data (by forming 50 groups), this procedure 
permits the conclusion that for most accounting ratios, the linearity assumption is 
indeed valid. As an example the relationship between the variable “EBIT/Total 
Assets” and the empirical log odd as well as the estimated linear regression is 
depicted in Fig. 2.1. The regression fit is as high as 78.02%. 

However, one explanatory variable, namely sales growth, shows a non-linear 
and even non-monotone behaviour, just as was expected. Hence, as portrayed in 
Fig. 2.2, due to the linearity assumption inherent in the logit model, the relationship 
between the original ratio sales growth and the default event cannot be correctly 
captured by such a model. 


Fig. 2.1 Relationship between “EBIT/Total Assets” and log odd 
[Plot not reproduced: empirical log odds against EBIT/Total Assets with the fitted values of the linear regression; R²: 0.7802.] 

Fig. 2.2 Relationship between “Sales Growth” and log odd 
[Plot not reproduced: empirical log odds against Net Sales/Net Sales Last Year, with smoothed values and the linear prediction.] 


Therefore, to enable the inclusion of the indicator sales growth into the rating 
model, the ratio has to be linearized before logit regressions can be estimated. This 
can be done in the following way: the points obtained from dividing the variable 
sales growth into groups and plotting them against the respective empirical log odds 
are smoothed by a filter, for example the one proposed in Hodrick and Prescott 
(1997), to reduce noise. Then the original values of sales growth are transformed to 
log odds according to this smoothed relationship, and in any further analysis the 
transformed log odd values replace the original ratio as input variable. 
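
The linearization step can be sketched as follows. Note the substitution: for brevity, a centered moving average stands in for the Hodrick-Prescott filter named in the text, and values between group means are obtained by piecewise-linear interpolation, which is an assumption. The group means and log odds are hypothetical and only mimic a U-shaped relationship.

```python
def moving_average(ys, window=3):
    """Centered moving average; a simple stand-in for the Hodrick-Prescott filter."""
    half = window // 2
    out = []
    for i in range(len(ys)):
        lo, hi = max(0, i - half), min(len(ys), i + half + 1)
        out.append(sum(ys[lo:hi]) / (hi - lo))
    return out


def interpolate(x, xs, ys):
    """Piecewise-linear lookup for strictly increasing xs;
    values outside [xs[0], xs[-1]] are held flat."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + w * (ys[i] - ys[i - 1])


# Hypothetical group means of sales growth and their empirical log odds
group_means = [0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
log_odds = [-1.8, -2.4, -2.9, -2.7, -2.2, -1.7]
smoothed = moving_average(log_odds)


def transform_sales_growth(value):
    """Map an original sales growth value to its smoothed log odd."""
    return interpolate(value, group_means, smoothed)
```

The transformed variable, rather than the original ratio, would then enter the logit regression, so that both shrinking and very fast-growing firms receive a higher log odd than firms with moderate growth.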

This test for the appropriateness of the linearity assumption also allows for a first 
check as to whether the univariate dependence between the considered explanatory 
variables and the default probability is as expected. For the Austrian data the 
univariate relationships between the investigated indicators and the default event 
coincide with the hypotheses postulated in Table 2.2, i.e. all ratios behave in an 
economically meaningful way. 


2.5 Model Building 


2.5.1 Pre-selection of Input Ratios 


After verifying that the underlying assumptions of a logistic regression are valid, the 
model building process can be started. However, although typically a huge number of 
potential input ratios are available when developing a rating model, from a statistical 
point of view it is not advisable to enter all these variables into the logit regression. If, 
for example, some highly correlated indicators are included in the model, the 
estimated coefficients will be significantly and systematically biased. Hence, it is 
preferable to pre-select the most promising explanatory variables by means of the 
univariate power of and the correlation between the individual input ratios. 

To do so, given the data set at hand is large enough to allow for it, the available 
data should be divided into one development and one validation sample by ran- 
domly splitting the whole data into two sub-samples. The first one, which typically 
contains the bulk of all observations, is used to estimate rating models, while the 
remaining data is left for an out-of-sample evaluation. When splitting the data, it 
should be ensured that all observations of one firm belong exclusively to one of the 
two sub-samples and that the ratio of defaulting to non-defaulting firms is similar in 
both data sets. For the Austrian data, about 70% of all observations are chosen for 
the training sample as depicted in Table 2.3. 
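
A split that respects both conditions can be sketched as below: whole firms, not single firm-years, are assigned to one sub-sample, and the assignment is done separately for defaulting and non-defaulting firms so that the default ratio is similar in both sets. The record layout, the seed and the toy data are hypothetical.

```python
import random


def split_by_firm(observations, train_share=0.7, seed=0):
    """Randomly assign whole firms to the training or validation sample,
    separately for defaulting and non-defaulting firms."""
    rng = random.Random(seed)
    all_firms = {o["firm"] for o in observations}
    defaulters = {o["firm"] for o in observations if o["default"] == 1}
    train_firms = set()
    for group in (sorted(defaulters), sorted(all_firms - defaulters)):
        rng.shuffle(group)
        cut = round(train_share * len(group))
        train_firms.update(group[:cut])
    train = [o for o in observations if o["firm"] in train_firms]
    valid = [o for o in observations if o["firm"] not in train_firms]
    return train, valid


# Toy panel: 20 firms with two firm-years each; firms 0 to 3 default in 1996
observations = [
    {"firm": f, "year": y, "default": int(f < 4 and y == 1996)}
    for f in range(20) for y in (1995, 1996)
]
train, valid = split_by_firm(observations)
```

Stratifying at the firm level is a design choice of this sketch; with very few defaulters per year, as in 1998 and 1999 of the Austrian data, a perfectly balanced split is not attainable.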

Table 2.3 Division of the data into in- and out-of-sample subsets 

Year    Training sample                 Validation sample 
        Non-defaulting   Defaulting     Non-defaulting   Defaulting 
1995    828              43             357              11 
1996    429              44             187              24 
1997    187              25             74               21 
1998    20               2              7                0 
1999    17               1              6                0 


The concrete pre-selection process now looks as follows: At first, univariate logit 
models are estimated in-sample for all potential input ratios, whose power to 
identify defaults in the development sample is evaluated via the criterion of the 
accuracy ratio (AR), a concept discussed in detail in Chap. 13. Afterwards, the 
pairwise correlation between all explanatory variables is computed to identify 
sub-groups of highly correlated indicators, where by rule of thumb ratios with absolute 
correlation values of above 50% are pooled into one group. Finally, from each 
correlation sub-group (which usually contains only ratios from one specific credit risk 
category), the explanatory variable with the highest and hence best accuracy ratio 
in the univariate analysis is selected for the multivariate model building 
process. 
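
The pre-selection rule can be sketched as follows. The accuracy ratios and correlations below are hypothetical placeholders, not the values of the Austrian data set, and the sketch assumes that pooling is transitive (if A correlates with B and B with C, all three end up in one group), which the text does not specify.

```python
def preselect(accuracy_ratio, correlation, threshold=0.5):
    """Pool ratios whose absolute pairwise correlation exceeds `threshold`
    and keep the ratio with the highest accuracy ratio from each group."""
    names = list(accuracy_ratio)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links up to the representative of the group.
        while parent[n] != n:
            n = parent[n]
        return n

    for (a, b), rho in correlation.items():
        if abs(rho) > threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(max(g, key=accuracy_ratio.get) for g in groups.values())


# Hypothetical univariate accuracy ratios and pairwise correlations
ar = {"equity_ta": 0.32, "liabilities_ta": 0.30,
      "ebit_ta": 0.25, "obi_ta": 0.28, "sales_ta": 0.10}
corr = {("equity_ta", "liabilities_ta"): -0.95,
        ("ebit_ta", "obi_ta"): 0.90,
        ("equity_ta", "ebit_ta"): 0.20}
selected = preselect(ar, corr)
```

In this toy example the two leverage ratios and the two profitability ratios each form one group, from which only the ratio with the higher accuracy ratio survives.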

Table 2.4 displays the accuracy ratios of and the correlation between the 
financial ratios calculated for the Austrian data set. As can be seen, explanatory 
variable 1 is highly correlated with indicator 2 (both measuring leverage) and ratio 
10 with variable 11 (both reflecting profitability). Besides, the input ratios 2 and 11 
have better (higher) accuracy ratios than indicators 1 and 10, respectively; hence, 
the latter are dropped from the list of explanatory variables for the multivariate 
analysis. 


2.5.2 Derivation of the Final Default Prediction Model 


Those ratios pre-selected in the previous step are now used to derive the final 
multivariate logit model. Usually, however, the number of potential explanatory 
variables is still too high to specify a logit model that contains all of them, because 
the optimal model should contain only a few, highly significant input ratios to avoid 
overfitting. Thus, even in our small example with only 12 indicators being left, we 
would have to construct and compare 2^12 = 4,096 models in order to determine the 
“best” econometric model and to entirely resolve model uncertainty. This is, of 
course, a tough task, which becomes infeasible for typical short lists of about 30 to 60 
pre-selected input ratios. Therefore, the standard procedure is to use forward/ 
backward selection to identify the final model (see Hosmer and Lemeshow 2000). 

For the Austrian data set, backward elimination, one of the statistical stepwise 
variable selection procedures implemented in most statistical software packages, 
was applied to derive the final logit model. This 
method starts by estimating the full model (with all potential input ratios) and 
continues by eliminating the worst covariates one by one until all remaining 
explanatory variables are significant at the chosen confidence level, usually set 
at 90% or 95%. 
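
The control flow of backward elimination can be sketched as below. The estimation itself is deliberately left out: `fit_p_values` is a hypothetical, caller-supplied function that would re-estimate the logit model on the given variables and return their p-values. The toy stand-in used here returns fixed p-values, whereas in a real application they change with every refit.

```python
def backward_elimination(variables, fit_p_values, alpha=0.10):
    """Drop the least significant covariate until all p-values are below alpha.

    `fit_p_values(selected)` is an assumed callback returning
    {variable: p_value} for a model estimated on `selected`.
    """
    selected = list(variables)
    while selected:
        p_values = fit_p_values(selected)
        worst = max(selected, key=p_values.get)
        if p_values[worst] < alpha:
            break  # every remaining covariate is significant
        selected.remove(worst)
    return selected


# Toy stand-in for a real estimation routine (fixed, hypothetical p-values)
toy_p = {"x1": 0.01, "x2": 0.40, "x3": 0.08, "x4": 0.15}


def toy_fit(selected):
    return {v: toy_p[v] for v in selected}


kept = backward_elimination(["x1", "x2", "x3", "x4"], toy_fit)
```

With `alpha=0.10`, corresponding to the 90% level, the sketch removes `x2` and then `x4` and stops once the weakest remaining variable is significant.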

Table 2.5 describes two logit models derived by backward elimination for the 
Austrian data. It depicts the constants of the logit models and the estimated coefficients 
for all those financial ratios that enter into the respective model. The stars represent 
the significance level of the estimated coefficients and indicate that the true 
parameters are different from zero with a probability of 90% (*), 95% (**) or 
99% (***). 


Table 2.5 Estimates of multivariate logit models 

No.  Financial ratio                               Risk factor     Model 1     Model 2       Hypo. 
                                                                               (final M.) 
2    Equity/Total Assets                           Leverage        -0.98**     -0.85**       - 
3    Bank Debt/Total Assets                        Leverage        1.55***     1.21***       + 
4    Short Term Debt/Total Assets                  Liquidity       1.30**      1.56***       + 
6    Accounts Receivable/Net Sales                 Activity        1.71*                     + 
7    Accounts Payable/Net Sales                    Activity        2.31**      1.53*         + 
8    (Net Sales - Material Costs)/Personnel Costs  Productivity    -0.23***    -0.23***      - 
9    Net Sales/Total Assets                        Turnover        0.26**                    - 
     Constant                                                      -1.18       -0.95 

Model 1 arises if all 12 pre-selected variables are entered into the backward 
elimination process. Detailed analysis of this model shows that most signs of the 
estimated coefficients correspond to the postulated hypotheses. However, the model
specifies a positive relationship between ratio number 9, “Net Sales/Total
Assets”, and the default probability, while most empirical studies find that larger
firms default less frequently.
What’s more, even for our data sample a negative coefficient was estimated in 
the univariate analysis. For this reason, a closer inspection of input ratio 9 seems 
appropriate. 

Although the variable “Net Sales/Total Assets” does not exhibit a pairwise 
correlation of more than 50%, it shows absolute correlation levels of about 30% 
with several other covariates. This indicates that this particular ratio is too highly 
correlated (on a multivariate basis) with the other explanatory variables and has to 
be removed from the list of variables entering the backward elimination process. 

Model 2 in Table 2.5 depicts the resulting logit model. Here all coefficients are 
of comparable magnitude to those of model 1, except that the ratio “Accounts 
Receivable/Net Sales” becomes highly insignificant and is therefore excluded from 
the model. As a consequence, all estimated coefficients are now economically 
plausible, and we accept model 2 as our (preliminary) final model version. 


2.5.3 Model Validation 


Finally, the derived logit model has to be validated. In a first step, some statistical 
tests should be conducted in order to verify the model’s robustness and goodness of 
fit in-sample, and in a second step the estimated model should be applied to the 
validation sample to produce out-of-sample forecasts, whose quality can be eva- 
luated with the concept of the accuracy ratio and other methods depicted in 
Chap. 13. 

The goodness-of-fit of a logit model can be assessed in two ways: first, on the
basis of some test statistics that use various approaches to measure the distance
between the estimated probabilities and the actual defaults, and second, by analysing
individual observations, each of which can have a strong impact on the
estimated coefficients (for details see Hosmer and Lemeshow 2000).

One very popular goodness-of-fit test statistic is the Hosmer-Lemeshow test 
statistic that measures how well a logit model represents the actual probability of 
default for groups of firms of differently perceived riskiness. Here, the observations 
are grouped based on percentiles of the estimated default probabilities. For the 
Austrian data 10% intervals were used, i.e. ten groups were formed. Now for every
group the average estimated default probability is calculated and used to derive the
expected number of defaults per group. Next, this number is compared with the
number of realised defaults in the respective group. The Hosmer-Lemeshow test
statistic then summarises this information for all groups. In our case of ten groups 
the test statistic for the estimation sample is chi-square distributed with 8 degrees of 
freedom, and the corresponding p-value for the rating model can then be calculated 
as 79.91%, which indicates that the model fits quite well. 
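
The grouping and comparison steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the Austrian data: the function names are hypothetical, the statistic is the usual Hosmer-Lemeshow sum of squared differences between realised and expected defaults per group, and the p-value uses a closed-form chi-square tail that is valid only for an even number of degrees of freedom (which covers the case of ten groups and 8 degrees of freedom).

```python
import math

def chi2_sf_even_df(x, df):
    """Upper tail probability of a chi-square variable; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for even, positive df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(pds, defaults, n_groups=10):
    """Return (test statistic, p-value) for estimated PDs and 0/1 default flags."""
    pairs = sorted(zip(pds, defaults))                 # order loans by estimated PD
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(group)
        expected = sum(p for p, _ in group)            # expected number of defaults
        observed = sum(d for _, d in group)            # realised number of defaults
        mean_pd = expected / size                      # assumed to lie strictly in (0, 1)
        stat += (observed - expected) ** 2 / (size * mean_pd * (1.0 - mean_pd))
    # degrees of freedom = number of groups - 2, as for the estimation sample above
    return stat, chi2_sf_even_df(stat, n_groups - 2)
```

A large p-value, as reported for the rating model, means that the differences between expected and realised defaults per group are small relative to what sampling noise would produce.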

However, the Hosmer-Lemeshow goodness-of-fit test can also be regarded from 
another point of view for the application at hand. Until now we only dealt with the 
development of a model that assigns each corporation a certain default probability 
or credit score, which leads towards a ranking between the contemplated firms. 
However, in practice banks usually want to use this ranking to map the companies 
to an internal rating scheme that typically is divided into about ten to twenty rating 
grades. The easiest way to do so would be to use the percentiles of the predicted 
default probabilities to build groups. If for example ten rating classes shall be 
formed, then from all observations the 10% with the smallest default probabilities 
would be assigned the best rating grade, the next 10% the second and so on till the 
last 10% with the highest estimated default probabilities would enter into the worst 
rating class. The Hosmer-Lemeshow test now tells us that, given one would apply 
the concept described above to form rating categories, overall the average expected 
default probability per rating grade would fit with the observed default experience 
per rating class. 
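
The percentile-based mapping to rating grades can be sketched as follows. The function name and the tie handling (ties are broken by position in the input list) are assumptions made for this illustration; grade 1 denotes the best grade, i.e. the lowest estimated default probabilities.

```python
def assign_rating_grades(pds, n_grades=10):
    """Map estimated PDs to grades 1..n_grades of (nearly) equal size."""
    order = sorted(range(len(pds)), key=lambda i: pds[i])   # indices from lowest to highest PD
    grades = [0] * len(pds)
    for rank, i in enumerate(order):
        grades[i] = rank * n_grades // len(pds) + 1          # first 1/n_grades of loans -> grade 1
    return grades

# Four hypothetical borrowers split into two grades
example_grades = assign_rating_grades([0.05, 0.01, 0.20, 0.10], n_grades=2)
```

In practice banks often choose grade boundaries on the PD scale rather than equally sized classes; the equal-size split is simply the easiest variant mentioned in the text.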

What’s more, as depicted in Table 2.6, the in-sample accuracy ratio is about 
44%, which is a reasonable number. Usually the rating models for corporate 
exposures presented in the literature have an accuracy ratio between 40% and 
70%. As discussed in Chap. 13 in detail, AR can only be compared reliably for
models that are applied to the same data set, because differences in the data set such
as varying relative amounts of defaulters or non-equal data reliability drive this
measure heavily. Hence, an AR of about 44% seems satisfactory.
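
For readers who want to reproduce such a figure on their own data, the following sketch computes the accuracy ratio from estimated default probabilities and default flags. It relies on the known relation AR = 2 · AUC − 1, where AUC is the probability that a randomly chosen defaulter has a higher estimated PD than a randomly chosen non-defaulter (ties counted as one half). The function name is hypothetical and the pairwise loop is chosen for clarity, not speed.

```python
def accuracy_ratio(pds, defaults):
    """Accuracy ratio computed as 2 * AUC - 1 from PDs and 0/1 default flags."""
    bad = [p for p, d in zip(pds, defaults) if d == 1]
    good = [p for p, d in zip(pds, defaults) if d == 0]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    concordant = 0.0
    for b in bad:
        for g in good:
            if b > g:
                concordant += 1.0      # defaulter ranked as riskier: correct ordering
            elif b == g:
                concordant += 0.5      # tie counts as half
    auc = concordant / (len(bad) * len(good))
    return 2.0 * auc - 1.0
```

A model that ranks every defaulter above every non-defaulter yields an AR of 1, while a model without discriminatory power yields an AR close to 0.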


Table 2.6 Validation results of the final logit model

Final model      Accuracy    $\sigma_{AR}$    95% conf. interval    Hosmer-Lemeshow
(model 2)        ratio                                              test statistic p-value
In-sample        0.4418      0.0444           [0.3574, 0.5288]      79.91%
Out-of-sample    0.4089      0.0688           [0.2741, 0.5438]      68.59%

Finally, the out-of-sample accuracy ratio amounts to about 41%, which is almost
as high as the in-sample AR. This implies that the derived rating model is stable and
powerful also in the sense that it produces accurate default predictions for new data
that was not used to develop the model. Therefore, we can now finally accept
the derived logit model as our final rating tool.


2.6 Conclusions 


This chapter focused on the special difficulties that are encountered when develop- 
ing internal rating models for corporate exposures. Although the whole process 
with data collection and processing, model building and validation usually takes 
quite some time and effort, the job is not yet completed with the implementation of 
the derived rating model. The predictive power of all statistical models depends 
heavily on the assumption that the historical relationship between the model’s 
covariates and the default event will remain unchanged in the future. Given the 
wide range of possible events such as changes in firms’ accounting policies or 
structural disruptions in certain industries, this assumption is not guaranteed over 
longer periods of time. Hence, it is necessary to revalidate and, where necessary,
recalibrate the model regularly in order to ensure that its predictive power does not
diminish.


Chapter 3
Scoring Models for Retail Exposures 


Daniel Porath 


3.1 Introduction 


Rating models for retail portfolios deserve a more detailed examination because
retail portfolios differ from other bank portfolios. The differences can mainly be attributed to
the specific data structure encountered when analyzing retail exposures. One 
implication is that different statistical tools have to be used when creating the 
model. Most of these statistical tools do not belong to the banker’s standard 
toolbox. At the same time — and strictly speaking for the same reason — the banks’ 
risk management standards for retail exposures are not comparable to those of other 
portfolios. 

Banks often use scoring models for managing the risk of their retail portfolios. 
Scoring models are statistical risk assessment tools especially designed for retail 
exposures. They were initially introduced to standardize the decision and monitor- 
ing process. With respect to scoring, the industry had established rating standards 
for retail exposures long before the discussion about the IRBA emerged. The Basel 
Committee acknowledged these standards and has modified the minimum require- 
ments for the internal rating models of retail exposures. The aim of this chapter is to 
discuss scoring models in the light of the minimum requirements and to introduce 
the non-standard statistical modelling techniques which are usually used for building 
scoring tables. 

The discussion starts with an introduction to scoring models comprising a
general description of scoring, a distinction of different kinds of scoring models
and an exposition of the theoretical differences compared to other parametric rating
models. In Sect. 3.3, we extract the most important minimum requirements for
retail portfolios from the New Basel Capital Framework and consider their relevance
for scoring models. Section 3.4 is dedicated to modelling techniques. Here,
special focus is placed on the preliminary univariate analysis because it differs
completely from the univariate analysis of other portfolios. We conclude with a short summary.


3.2 The Concept of Scoring


3.2.1 What is Scoring? 


Like any rating tool, a scoring model assesses a borrower’s creditworthiness. The 
outcome of the model is expressed in terms of a number called “score”. Increasing 
scores usually indicate declining risk, so that a borrower with a score of 210 is more 
risky than a borrower with a score of 350. A comprehensive overview about scoring 
can be found in Thomas et al. (2002). 

The model which calculates the score is often referred to as a scoring table, 
because it can be easily displayed in a table. Table 3.1 shows an extract of two 
variables from a scoring model (usually scoring models consist of about 7 up to 15 
variables): 

The total customer score can be calculated by adding the scores of the bor- 
rower’s several characteristics. Each variable contains the category “neutral”. The 
score of this category represents the portfolio mean of the scores for a variable and 
therewith constitutes a benchmark when evaluating the risk of a specific category. 
Categories with higher scores than “neutral” are below the average portfolio risk 
and categories with lower scores are more risky than the average. For example, 
divorced borrowers display increased risk compared to the whole portfolio, because 
for the variable “marital status” the score of a divorced borrower (16) is lower than 
the score for the category “neutral” (19). 
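
The additive logic of a scoring table can be illustrated with a short sketch that uses the scores of Table 3.1. The dictionary layout, the key names and the helper functions are assumptions made for this illustration, and the age classes are read as half-open intervals (for example, "18 < 24" is taken to mean 18 up to but not including 24).

```python
import bisect

# Scores copied from Table 3.1; layout and key names are illustrative only.
SCORING_TABLE = {
    "marital_status": {
        "unmarried": 20,
        "married_or_widowed": 24,
        "divorced_or_separated": 16,
        "no_answer": 16,
        "neutral": 19,
    },
    "age": {
        "18-23": 14, "24-31": 16, "32-37": 25,
        "38-49": 28, "50-64": 30, "65+": 32,
        "neutral": 24,
    },
}

AGE_BOUNDS = [24, 32, 38, 50, 65]                     # lower bounds of classes 2..6
AGE_CLASSES = ["18-23", "24-31", "32-37", "38-49", "50-64", "65+"]

def age_class(age):
    """Return the age class of Table 3.1 for an age in years (assumed >= 18)."""
    if age < 18:
        raise ValueError("Table 3.1 starts at age 18")
    return AGE_CLASSES[bisect.bisect_right(AGE_BOUNDS, age)]

def total_score(characteristics):
    """Add up the scores of all characteristics of one borrower."""
    return sum(SCORING_TABLE[var][cat] for var, cat in characteristics.items())

# A divorced borrower aged 39: 16 (marital status) + 28 (age) = 44
borrower = {"marital_status": "divorced_or_separated", "age": age_class(39)}
score = total_score(borrower)
# Deviation of the category from the portfolio benchmark "neutral": 16 - 19 = -3
deviation = (SCORING_TABLE["marital_status"]["divorced_or_separated"]
             - SCORING_TABLE["marital_status"]["neutral"])
```

Since only two of the usual 7 to 15 variables are shown in the extract, the resulting total is of course not a complete customer score.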

Scoring models usually are estimated with historical data and statistical meth- 
ods. The historical data involves information about the performance of a loan 
(“good” or “bad”) and about the characteristics of the loan some time before. The 
time span between the measurement of the characteristic on the one hand and the 
performance on the other hand determines the forecast horizon of the model. 

Estimation procedures for scoring models are logistic regression, discriminant 
analysis or similar methods. The estimation results are the scores of the single
characteristics. Usually the scores are rescaled after estimation in order to obtain
round numbers as in the example shown in Table 3.1. More details regarding
estimation of the scores are shown in Sect. 3.4.


Table 3.1 Extract from a scoring table

Variable                        Score of the variables’ attributes
Marital status of borrower
  Unmarried                     20
  Married or widowed            24
  Divorced or separated         16
  No answer                     16
  Neutral                       19
Age of borrower
  18 < 24                       14
  24 < 32                       16
  32 < 38                       25
  38 < 50                       28
  50 < 65                       30
  65 or older                   32
  Neutral                       24


3.2.2 Classing and Recoding 


Scoring is a parametric rating model. This means that modelling involves the
estimation of the parameters $\beta_0, \ldots, \beta_N$ in a general model


$$S_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_N x_{iN}. \qquad (3.1)$$


Here $S_i$ denotes the score of the loan $i = 1, \ldots, I$ and $x_{i1}, \ldots, x_{iN}$ are the input
parameters or variables for the loan $i$. The parameters $\beta_n$ ($n = 0, \ldots, N$) reflect the
impact of a variation of the input factors on the score.

Scoring differs from other parametric rating models in the treatment of the input 
variables. As can be seen from Table 3.1, the variable “marital status” is a qualita- 
tive variable, therefore it enters the model categorically. Some values of the 
variable have been grouped into the same category, like for example “married” 
and “widowed” in order to increase the number of borrowers within each class. The 
grouping of the values of a variable is a separate preliminary step before estimation 
and is called “classing”. 

The general approach in (3.1) cannot manage categorical variables and therefore
has to be modified. To this end, the (categorical) variable $x_n$ has to be recoded.
An adequate recoding procedure for scoring is to add the category “neutral” to the
existing number of $C$ categories and replace $x_n$ by a set of dummy variables $d_{x_n(c)}$,
$c = 1, \ldots, C$, which are defined in the following way:


$$d_{x_n(c)} = \begin{cases} 1 & \text{for } x_n = c \\ -1 & \text{for } x_n = \text{“neutral”} \\ 0 & \text{else.} \end{cases} \qquad (3.2)$$


The recoding given in (3.2) is called effect coding and differs from the standard 
dummy variable approach where the dummies only take the values 0 and 1. The 
benefit from using (3.2) is that it allows for the estimation of a variable-specific 
mean which is the score of the category “neutral”. As can be seen from (3.2), the 
value of the category “neutral” is implicitly given by the vector of dummy values 
$(-1, \ldots, -1)$. The coefficients of the other categories then represent the deviation
from the variable-specific mean. 

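This can be illustrated by recoding and replacing the first variable $x_{i1}$ in (3.1).
Model (3.1) then becomes


$$S_i = \beta_0 + (\beta_{10} + \beta_{11} d_{x_1(1),i} + \beta_{12} d_{x_1(2),i} + \ldots + \beta_{1C} d_{x_1(C),i}) + \beta_2 x_{i2} + \ldots + \beta_N x_{iN}. \qquad (3.3)$$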




Here $(\beta_{10} - \beta_{11} - \beta_{12} - \ldots - \beta_{1C})$ is the variable-specific average (“neutral”)
and the coefficients $\beta_{11}, \ldots, \beta_{1C}$ represent the deviation of the individual categories
from the average. The scores of the single categories (see Table 3.1) are given by
the sums $\beta_{10} + \beta_{11}, \beta_{10} + \beta_{12}, \ldots, \beta_{10} + \beta_{1C}$.
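
A small sketch may help to see how effect coding turns coefficients into category scores. The function names are hypothetical, and the coefficient values are not estimates from any data set: they were chosen by us so that the resulting scores reproduce the marital-status block of Table 3.1.

```python
def effect_code(value, categories):
    """Effect coding as in (3.2): 1 for the matching category, -1 for 'neutral', 0 otherwise."""
    if value == "neutral":
        return [-1] * len(categories)
    if value not in categories:
        raise ValueError(f"unknown category: {value}")
    return [1 if value == c else 0 for c in categories]

def category_score(beta_0, betas, dummies):
    """Contribution of one recoded variable: beta_0 plus the sum of beta_c * d_c."""
    return beta_0 + sum(b * d for b, d in zip(betas, dummies))

categories = ["unmarried", "married or widowed", "divorced or separated", "no answer"]
beta_10 = 19                 # illustrative variable-specific constant
betas = [1, 5, -3, -3]       # illustrative deviations from the average

scores = {c: category_score(beta_10, betas, effect_code(c, categories))
          for c in categories + ["neutral"]}
```

With these values the deviations sum to zero, so the score of "neutral" coincides with the constant and equals the unweighted mean of the four category scores.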

Apart from the special recoding function (3.2), the procedure discussed so far is 
the standard procedure for handling categorical variables. The major characteristic 
of scoring is that the same procedure is conducted for the quantitative variables. 
This means that all variables are classed and recoded prior to estimation and 
therefore are treated as categorical variables. As a consequence, the overall mean 
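$\beta_0$ in (3.3) disappears and the model can be rewritten as:


$$S_i = (\beta_{10} + \beta_{11} d_{x_1(1),i} + \ldots + \beta_{1C} d_{x_1(C),i}) + \ldots + (\beta_{N0} + \beta_{N1} d_{x_N(1),i} + \ldots + \beta_{NC} d_{x_N(C),i}). \qquad (3.4)$$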


With an increasing number of variables and categories, equation (3.4) soon 
becomes unmanageable. This is why scoring models are usually displayed in tables. 

The effect of classing and recoding is twofold: On the one hand, the information
about the variation of the quantitative variable within a class disappears. As can be seen
from Table 3.1, an increasing age reduces risk. The model, however, does not
indicate any difference between the age of 39 and 49, because the same score is
attributed to both ages. If the variable age entered the model as a quantitative
variable with the estimated coefficient $\beta_{age}$, any difference in age ($\Delta age$) would be
captured by the model (its effect on risk, i.e. the score, ceteris paribus, being
$\beta_{age} \cdot \Delta age$). On the other hand, categorization allows for flexible risk patterns. Referring
again to the example of age, the impact on risk may be strong for the lower age
categories while diminishing for increasing ages. Such a nonlinear impact on the
score $S_i$ can be modelled by selecting narrow classes for lower ages and broad
classes for higher ages. The quantitative model, on the contrary, attributes the same
impact of $\beta_{age}$ to a one-year change in age starting from any level. Thus, classing
and recoding is an easy way to introduce nonlinearities in the model.

The theoretical merits from classing and recoding, however, were not pivotal for 
the wide use of scoring models. The more important reason for classing and recoding 
is that most of the risk-relevant input variables for retail customers are qualitative. 
These are demographic characteristics of the borrower (like marital status, gender, or 
home ownership), the type of profession, information about the loan (type of loan, 
intended use) and information about the payment behaviour in the past (due payment 
or not). The reason for transforming the remaining quantitative variables (like age or 
income) into categorical variables is to obtain a uniform model. 


3.2.3 Different Scoring Models 


Banks use different scoring models according to the type of loan. The reason is that 
the data which is available for risk assessment is loan-specific. For example, the 
scoring of a mortgage loan can make use of all the information about the real estate
whereas there is no comparable information for the scoring model of a current 
account. On the other hand, models for current accounts involve much information 
about the past payments observed on the account (income, drawings, balance) 
which are not available for mortgage loans. For mortgage loans, payment informa- 
tion generally is restricted to whether the monthly instalment has been paid or not. 
As a consequence, there are different models for different products and when the 
same person has two different loans at the same bank, he or she generally will have 
two different scores. This is a crucial difference to the general rating principles of 
Basel II. 

Scoring models which are primarily based on payment information are called 
behavioural scoring. The prerequisite for using a behavioural score is that the bank 
observes information about the payment behaviour on a monthly basis, so that the 
score changes monthly. Furthermore, in order to obtain meaningful results, at least 
several monthly payment transactions should be observed for each customer. Since 
the behavioural score is dynamic, it can be used for risk monitoring. Additionally, 
banks use the score for risk segmentation when defining strategies for retail 
customers, like for example cross-selling strategies or the organization of the 
dunning process (“different risk, different treatment”).

When payment information is sporadic, it is usually not implemented in the 
scoring model. The score then involves static information which has been queried 
in the application form. This score is called an application score. In contrast to the 
behavioural score, the application score is static, i.e. once calculated it remains 
constant over time. It is normally calculated when a borrower applies for a loan and 
helps the bank to decide whether it should accept or refuse the application. 
Additionally, by combining the score with dynamic information it can be used as 
a part of a monitoring process. 


3.3 Scoring and the IRBA Minimum Requirements 


Internal rating systems for retail customers were in use long before Basel II. The
reason is that statistical models for risk assessment are especially advantageous for
the retail sector: on the one hand, the high granularity of a retail portfolio allows
banks to realize economies of scale by standardization of the decision and monitoring
processes. On the other hand, the database generally consists of a large amount
of homogeneous data. Homogeneity is owed to standardized forms for application
and monitoring. As a consequence, the database is particularly suited for modelling.
In fact, statistical procedures for risk forecasting of retail loans have a history of
several decades (cf. Hand 2001), starting with the first attempts in the 1960s and
coming into wide use in the 1980s. Today, scoring is the industry standard for the
rating of retail customers. Since these standards have developed independently
from the New Basel Capital Approach, there are some differences to the IRBA
minimum requirements. The Capital Accord has acknowledged these differences
and consequently modified the rules for retail portfolios. Hence most banks will
meet the minimum requirements, possibly after some slight modifications of their 
existing scoring systems. In the following subsections we discuss the meaning of 
some selected minimum requirements for scoring and therewith give some sugges- 
tions about possible modifications. The discussion is restricted to the minimum 
requirements, which according to our view, are the most relevant for scoring. We 
refer to the Revised Framework of the Basel Committee on Banking Supervision 
from June 2004 (cf. BIS 2004) which for convenience in the following is called 
Capital Framework. 


3.3.1 Rating System Design 


Following § 394 of the Capital Framework, a rating system comprises the assign- 
ment of a rating to credit risk and the quantification of default and loss estimates. 
However, scoring models only provide the first component, which is the score S;. 
The default and loss estimates (which in the Capital Framework are PD, LGD, and 
EAD) usually are not determined by the scoring model. When a bank intends to use 
a scoring model for the IRBA, these components have to be assessed separately. 


3.3.2 Rating Dimensions 


Generally, the IRBA requires a rating system to be separated into a borrower-specific
component and a transaction-specific component (see § 396 of the Capital Frame- 
work). However, in the previous section we have seen that scoring models typically 
mix variables about the borrower and the type of loan. In order to render scoring 
models eligible to the IRBA, the Basel Committee has modified the general 
approach on the rating dimensions for retail portfolios. According to § 401 of the 
Capital Framework both components should be present in the scoring model, but 
need not be separated. Consequently, when referring to the risk classification of 
retail portfolios, the Capital Framework uses the term pool instead of rating grade. 

With § 401, banks have greater flexibility when defining pools, as long as the 
pooling is based on all risk-relevant information. Pools can be customer-specific or 
loan-specific (like in a scoring model) or a mixture of both. A further consequence 
of § 401 is that one and the same borrower is allowed to have two different scores.


3.3.3 Risk Drivers 


Paragraph 402 of the Capital Framework specifies the risk drivers banks should use 
in a scoring model. These cover borrower characteristics, transaction characteristics 
and delinquency. As seen in the previous section, borrower and transaction characteristics
are integral parts of a scoring table. Delinquency, on the other hand, is
not usually integrated in a scoring model. The rationale is that scoring aims at 
predicting delinquency and that therefore no forecast is needed for a delinquent 
account. However, a correct implementation of a scoring model implies that 
delinquent accounts are separated (and therefore identified), so that the calculation 
of the score can be suppressed. Hence, when using a scoring model, normally all 
risk drivers mentioned in § 402 of the Capital Framework are integrated. 


3.3.4 Risk Quantification 


Risk quantification in terms of Basel II is the assessment of expected loss as the
product of PD, LGD and EAD. Since the expected loss of a loan determines the
risk weight for the capital requirement, the regulatory capital framework contains 
precise definitions for the quantification of these components. This means that the 
underlying time horizon is fixed to 1 year and that the underlying default event is 
explicitly defined. 

Scoring models generally do not follow these definitions since their primary aim 
is not to fulfil the supervisory requirements but to provide internal decision support. 
The application score, for example, tells whether an application for a loan should be 
accepted or refused and for this decision it would not suffice to know whether the 
loan will default in the following year only. Instead, the bank is interested to know 
whether the loan will default in the long run, and therefore scoring models generally 
provide long-run predictions. Additionally, the default event sets in as soon as the loan
is no longer profitable for the bank and this is usually not the case when the
loan defaults according to the Basel definition. It depends, instead, on the bank’s
internal calculation. 

To sum up, scoring models used for internal decision support generally will not 
comply with the requirements about risk quantification. A strategy to conserve the 
power of an internal decision tool and at the same time achieve compliance with the 
minimum requirements is: 


• Develop the scoring model with the internal time-horizons and definitions of
default.

• Define the pools according to § 401 of the Capital Framework.

• Estimate the pool-specific PD, LGD and EAD following the Basel definitions in
a separate step.


Finally, it should be noted that the time horizon for assigning scores is not 
specified in the Basel Accord. In paragraph 414 of the Capital Framework it is 
stated that the horizon should be generally longer than 1 year. The long-term
horizon normally used by scoring systems therefore conforms to the minimum
requirements.


3.3.5 Special Requirements for Scoring Models


In § 417 the Capital Framework explicitly refers to scoring models (and other 
statistical models) and specifies some additional requirements. The rationale is 
that the implementation of a scoring model leads to highly standardized decision 
and monitoring processes where failures may be overlooked or detected too late. 
Therefore, the requirements given in § 417 refer to special qualitative features of the 
model and special control mechanisms. 

These requirements will generally be met when banks follow the industry
standards for the development and implementation of scoring models. The most 
important standards which have to be mentioned in this context are: 


• The use of a representative database for the development of the model

• Documentation about the development including univariate analysis

• Preparation of a user’s guide

• Implementation of a monitoring process


3.4 Methods for Estimating Scoring Models 


The statistical methods which are suitable for estimating scoring models comprise 
the techniques introduced in Chap. 1, e.g. logit analysis, or discriminant analysis, 
with the special feature that all input variables enter the model as categorical 
variables. This requires an extensive preliminary data analysis which is referred 
to as “univariate analysis”. Univariate analysis generally is interesting for rating 
analysis because it serves to detect problems concerning the data and helps to 
identify the most important risk-drivers. However, for retail portfolios, univariate 
analysis is more complex and more important than in the general case. There are 
several reasons for this: 


• Univariate analysis determines the classes on which the recoding is based (see
Sect. 3.2) and hereby becomes an integral part of the model-building process.

• In retail portfolios, qualitative information is predominant (e.g. a person’s
profession, marital status).

• In retail portfolios, many qualitative variables are hard factors and do not
involve human judgement. Examples include a person’s profession, marital
status and gender. Note that qualitative information encountered in rating systems
for corporate loans often requires personal judgement on the part of the analyst
(e.g. a company’s management, the position in the market or the future development
of the sector where the company operates).

• For retail portfolios, a priori, it is often unknown whether a variable is relevant
for the risk assessment. For example, there is no theory which tells whether a
borrower’s profession, gender or domicile helps in predicting default. This is
different for the corporate sector where the main information consists of financial
ratios taken from the balance sheet. For example, EBIT ratios measure the
profitability of a firm and since profitability is linked to the firm’s financial
health, it can be classified as a potential risk factor prior to the analysis. For retail
portfolios, univariate analysis replaces a priori knowledge and therefore helps to
identify variables with a high discriminatory power.

• Often, the risk distribution of a variable is unknown a priori. This means that
before analyzing a variable, it is not clear which outcomes correlate with high
risks and which outcomes correlate with low risks. This is completely different
from the corporate sector, where for many financial ratios, the risk patterns are
well-known. For example, it is a priori known that ceteris paribus, high profitability
leads to low risk and vice versa. For retail portfolios, the risk distribution
has to be determined with the help of univariate analysis.


The consequences are twofold: On the one hand, univariate analysis is particularly
important for replacing a priori knowledge. On the other hand, the statistical 
methods applied in the univariate analysis should be designed to handle qualitative 
hard factors. 

The basic technique for creating a scoring model is crosstabulation. Crosstabs 
display the data in a two-dimensional frequency table, where the rows c = 1,...,C 
are the categories of the variable and the columns are the performance of the loan. 
The cells contain the absolute number of loans included in the analysis. Cross- 
tabulation is flexible because it works with qualitative data as well as quantitative 
data — quantitative information simply has to be grouped beforehand. A simple 
example for the variable “marital status” is displayed in Table 3.2. 

The crosstab is used to assess the discriminative power. The discriminative 
power of a variable or characteristic can be described as its power to discriminate 
between good and bad loans. However, it is difficult to compare the absolute figures 
in the table. In Table 3.2, the bank has drawn a sample of the good loans. This is a 
common procedure, because often it is difficult to retrieve historical data. As a 
consequence, in the crosstab, the number of good loans cannot be compared to the 
number of bad loans of the same category. It is therefore reasonable to replace 
the absolute values by the column percentages for the good loans $P(c|\mathrm{Good})$ and for
the bad loans $P(c|\mathrm{Bad})$, see Table 3.3.


Table 3.2 Crosstab for the variable “Marital status”

Marital status of borrower    No. of good loans    No. of bad loans
Unmarried                     700                  500
Married or widowed            850                  350
Divorced or separated         450                  650


Table 3.3 Column percentages, WoE and IV

Marital status of borrower    P(c|Good)    P(c|Bad)    WoE_c
Unmarried                     0.3500       0.3333       0.0488
Married or widowed            0.4250       0.2333       0.5996
Divorced or separated         0.2250       0.4333      -0.6554
IV                                                      0.2523

The discriminative power can be assessed by regarding the risk distribution of
the variable which is shown by the Weight of Evidence $WoE_c$ (see Good 1950). The
Weight of Evidence can be calculated from the column percentages with the
following formula:


$$WoE_c = \ln(P(c|\mathrm{Good})) - \ln(P(c|\mathrm{Bad})). \qquad (3.5)$$


The interpretation of $WoE_c$ is straightforward: Increasing values of the Weight of
Evidence indicate decreasing risk. A value of $WoE_c > 0$ ($WoE_c < 0$) means that in
category $c$ good (bad) loans are over-represented. In the above example, the Weight
of Evidence shows that loans granted to married or widowed customers have
defaulted with a lower frequency than those granted to divorced or separated
customers. The value of $WoE_c$ close to 0 for unmarried customers displays that
the risk of this group is similar to the average portfolio risk.
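
The calculation of the Weight of Evidence from a crosstab can be sketched as follows, using the counts of Table 3.2. The function name and the dictionary layout are assumptions made for this illustration; every category is assumed to contain at least one good and one bad loan, since the logarithm is undefined otherwise.

```python
import math

def weight_of_evidence(good_counts, bad_counts):
    """Weight of Evidence per category: ln P(c|Good) - ln P(c|Bad)."""
    total_good = sum(good_counts.values())
    total_bad = sum(bad_counts.values())
    woe = {}
    for c in good_counts:
        p_good = good_counts[c] / total_good     # column percentage of good loans
        p_bad = bad_counts[c] / total_bad        # column percentage of bad loans
        woe[c] = math.log(p_good) - math.log(p_bad)
    return woe

# Counts from Table 3.2
good = {"Unmarried": 700, "Married or widowed": 850, "Divorced or separated": 450}
bad = {"Unmarried": 500, "Married or widowed": 350, "Divorced or separated": 650}

woe = weight_of_evidence(good, bad)
for category, value in woe.items():
    print(f"{category}: {value:.4f}")
```

Rounded to four decimals, the results should agree with the WoE column of Table 3.3.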

The Weight of Evidence can also be interpreted in terms of the Bayes theorem. 
The Bayes theorem expressed in log odds is 


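\[ \ln\frac{P(\mathrm{Good} \mid c)}{P(\mathrm{Bad} \mid c)} = \ln\frac{P(c \mid \mathrm{Good})}{P(c \mid \mathrm{Bad})} + \ln\frac{P(\mathrm{Good})}{P(\mathrm{Bad})} \tag{3.6} \]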


Since the first term on the right-hand side of (3.6) is the Weight of Evidence, it
represents the difference between the a posteriori log odds and the a priori log odds.
The value of WoE_c therefore measures the improvement of the forecast through the
information of category c. Hence it is a performance measure for category c.
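
As a quick numerical check of (3.6), the following sketch uses the counts of Table 3.2. It assumes that the a priori odds are taken from the sample itself (2,000 good and 1,500 bad loans); because the good loans were sampled, these are not the odds of the full portfolio. All function and variable names are illustrative only.

```python
import math

# Counts from Table 3.2: category -> (good loans, bad loans)
counts = {
    "Unmarried": (700, 500),
    "Married or widowed": (850, 350),
    "Divorced or separated": (450, 650),
}

total_good = sum(g for g, _ in counts.values())
total_bad = sum(b for _, b in counts.values())

# A priori log odds ln(P(Good)/P(Bad)), measured in the sample (assumption)
prior_log_odds = math.log(total_good / total_bad)


def log_odds_decomposition(good, bad):
    """Return (a posteriori log odds, WoE) for one category."""
    posterior = math.log(good / bad)  # ln(P(Good|c)/P(Bad|c))
    woe = math.log(good / total_good) - math.log(bad / total_bad)  # (3.5)
    return posterior, woe


for category, (good, bad) in counts.items():
    posterior, woe = log_odds_decomposition(good, bad)
    # The a posteriori log odds equal WoE plus the a priori log odds, as in (3.6)
    print(f"{category}: {posterior:.4f} = {woe:.4f} + {prior_log_odds:.4f}")
```

For every category the left-hand side and the sum on the right-hand side coincide, which is just (3.6) evaluated on the sample.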

A comprehensive performance measure for all categories of an individual 
variable can be calculated as a weighted average of the Weights of Evidence for 
all categories c = 1,...,C. The result is called Information Value, IV (cf. Kullback
1959) and can be calculated by:


\[ \mathrm{IV} = \sum_{c=1}^{C} \mathrm{WoE}_c \, \bigl( P(c \mid \mathrm{Good}) - P(c \mid \mathrm{Bad}) \bigr) \tag{3.7} \]


A high value of IV indicates a high discriminatory power of a specific variable. 
The Information Value has a lower bound of zero but no upper bound. In the 
example of Table 3.3, the Information Value is 0.2523. Since there is no upper 
bound, from the absolute value we cannot tell whether the discriminatory power is 
satisfactory or not. In fact, the Information Value is primarily calculated for the 
purpose of comparison to other variables or alternative classings of the same 
variable and the same portfolio. 
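
The figures of Table 3.3 can be reproduced from the counts of Table 3.2 with a few lines of code. The following is a minimal sketch of (3.5) and (3.7); the function name `woe_and_iv` and the dictionary layout are our own choices for illustration, and the sketch assumes that every category contains at least one good and one bad loan.

```python
import math


def woe_and_iv(counts):
    """Compute the WoE per category and the IV from (good, bad) counts.

    counts maps category -> (number of good loans, number of bad loans).
    Assumes every category contains at least one good and one bad loan.
    """
    total_good = sum(g for g, _ in counts.values())
    total_bad = sum(b for _, b in counts.values())
    woe = {}
    iv = 0.0
    for category, (good, bad) in counts.items():
        p_good = good / total_good  # column percentage P(c|Good)
        p_bad = bad / total_bad     # column percentage P(c|Bad)
        woe[category] = math.log(p_good) - math.log(p_bad)  # (3.5)
        iv += woe[category] * (p_good - p_bad)              # (3.7)
    return woe, iv


# Counts from Table 3.2
table_3_2 = {
    "Unmarried": (700, 500),
    "Married or widowed": (850, 350),
    "Divorced or separated": (450, 650),
}

woe, iv = woe_and_iv(table_3_2)
for category, value in woe.items():
    print(f"{category}: WoE = {value:.4f}")
print(f"IV = {iv:.4f}")
```

Since each summand of (3.7) is a product of two factors with the same sign, every category contributes a non-negative amount, which is why the Information Value cannot fall below zero.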

The Information Value has the great advantage of being independent of the
order of the categories of the variable. This is an extremely important feature when
analyzing data with unknown risk distribution. It should be noted that most of the
better-known performance measures like the Gini coefficient or the power curve do
not share this feature and are therefore of only limited relevance for the univariate
analysis of retail portfolios.
Crosstabulation is a means to generate classings which are needed for the
recoding and estimation procedures. There are three requirements for a good
classing. First, each class should contain a minimum number of good and bad
loans, otherwise the estimation of the coefficients β in (3.4) tends to be imprecise.
Following a rule of thumb there should be at least 50 good loans and 50 bad loans in
each class. Probably this is why in the above example there is no separate category
“widowed”. Second, the categories grouped in each class should display a similar
risk profile. Therefore, it is reasonable to combine the categories “separated” and
“divorced” into one single class. Third, the resulting classing should reveal a plausible
risk pattern (as indicated by the Weight of Evidence) and a high performance (as
indicated by a high Information Value).

Fixing a classing is complex, because there is a trade-off between the requirements.
On the one hand, the Information Value tends to increase with an increasing
number of classes; on the other hand, estimation of the coefficients β tends to
improve when the number of classes decreases.
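
The first half of this trade-off can be made visible with the data of Table 3.2. The sketch below compares the original three classes with a coarser classing in which “Unmarried” and “Married or widowed” are merged. This merge is purely hypothetical and only serves as an illustration; it is not a classing proposed in the text.

```python
import math


def information_value(classes):
    """Information Value (3.7) for a list of (good, bad) counts per class."""
    total_good = sum(g for g, _ in classes)
    total_bad = sum(b for _, b in classes)
    iv = 0.0
    for good, bad in classes:
        p_good = good / total_good
        p_bad = bad / total_bad
        iv += (math.log(p_good) - math.log(p_bad)) * (p_good - p_bad)
    return iv


# Three classes as in Table 3.2
fine_classing = [(700, 500), (850, 350), (450, 650)]

# Hypothetical coarser classing: the first two classes merged into one
coarse_classing = [(700 + 850, 500 + 350), (450, 650)]

iv_fine = information_value(fine_classing)
iv_coarse = information_value(coarse_classing)
print(f"IV with three classes: {iv_fine:.4f}")
print(f"IV with two classes:   {iv_coarse:.4f}")
```

In this example the merged classing has the lower Information Value, while each of its classes contains more good and bad loans and therefore supports a more precise coefficient estimate.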

In order to fix the final classing, analysts produce a series of different crosstabs
and calculate the corresponding Weights of Evidence and Information Values. 
Finally, the best classing is selected according to the criteria above. The final 
classing therefore is the result of a heuristic process which is strongly determined 
by the analyst’s know-how and experience. 


3.5 Summary 


In this section, we briefly summarise the ideas discussed here. We have started from 
the observation that for retail portfolios, the methods for developing rating models 
are different from those applied to other portfolios. This is mainly due to the 
different type of data typically encountered when dealing with retail loans: First, 
there is a predominance of hard qualitative information which allows the integra- 
tion of a high portion of qualitative data in the model. Second, there is little 
theoretical knowledge about the risk relevance and risk distribution of the input 
variables. Therefore, analyzing the data requires special tools. Finally, there is a
large amount of comparably homogeneous data. As a consequence, statistical risk
assessment tools were developed long before rating models for banks’ other
portfolios became widespread, and the standards were settled independently of
Basel II. The standard models for the rating of retail portfolios are scoring models.
Generally, scoring models comply with the IRBA minimum requirements as long
as they fulfil the industry standards. However, usually they only constitute risk
classification systems in terms of the IRBA and it will be necessary to add a
component which estimates PD, EAD and LGD.

The estimation of a scoring model requires the classing of all individual vari- 
ables. This is done in a preliminary step called univariate analysis. The classings 
can be defined by comparing the performance of different alternatives. Since the risk
distribution of the variables is often completely unknown, the univariate analysis
should rely on performance measures which are independent of the ordering of
the single classes, like for example the Weight of Evidence and the Information
Value. Once the classing is settled the variables have to be recoded in order to build
the model. Finally, the model can be estimated with standard techniques like logit 
analysis or discriminant analysis. 
Chapter 4
The Shadow Rating Approach: Experience 
from Banking Practice 


Ulrich Erlenmaier 


4.1 Introduction 


In this article we will report on some aspects of the development of shadow rating 
systems found to be important when re-devising the rating system for large cor- 
porations of KfW Bankengruppe (KfW banking group). The article focuses on 
general methodological issues and does not necessarily describe how these issues 
are dealt with by KfW Bankengruppe. Moreover, due to confidentiality we do not 
report estimation results that have been derived. In this introductory section we 
want to describe briefly the basic idea of the shadow rating approach (SRA), then 
summarise the typical steps of SRA rating development and finally set out the scope 
of this article. 

The shadow rating approach is typically employed when default data are rare 
and external ratings from the three major rating agencies (Standard & Poor’s, 
Moody’s or Fitch) are available for a significant and representative part of the 
portfolio. As with other approaches to the development of rating systems, the first 
modelling step is to identify risk factors — such as balance sheet ratios or qualitative 
information about a company — that are supposed to be good predictors of future 
defaults. The SRA’s objective is to choose and weight the risk factors in such a way 
as to mimic external ratings as closely as possible when there is insufficient data to 
build an explicit default prediction model (the latter type of model is described, e.g.,
in Chap. 1). To make the resulting rating function usable for the bank’s internal risk
management as well as for regulatory capital calculation, the external rating grades 
(AAA, AA, etc.) have to be calibrated, i.e., a probability of default (PD) has to be 
attached to them. With these PDs, the external grades can then be mapped to the 
bank’s internal rating scale. 


The opinions expressed in this article are those of the author and do not reflect views of KfW 
Bankengruppe (or models applied by the bank). 
The following modular architecture is typical for SRA but also for other types of
rating systems:


1. Statistical model
2. Expert-guided adjustments
3. Corporate group influences/Sovereign support
4. Override


The statistical model constitutes the basis of the rating system and will most 
likely include balance sheet ratios, macroeconomic variables (such as country 
ratings or business cycle indicators) and qualitative information about the company 
(such as quality of management or the company’s competitive position). The 
statistical model will be estimated from empirical data that bring together compa- 
nies’ risk factors on the one hand and their external ratings on the other hand. The 
model is set up to predict external ratings — more precisely, external PDs — as 
efficiently as possible from the selected risk factors. 

The second modelling layer of the rating system, which we have termed “Expert-guided
adjustments”, will typically include risk factors for which either no historical
information is available or for which the influence on external ratings is difficult to
estimate empirically.¹ Consequently, these risk factors will enter the model in the
form of adjustments that are not estimated empirically but that are determined by 
credit experts. 

The third modelling layer will take into account the corporate group to which the 
company belongs or possibly some kind of government support.² This is typically
done by rating both the obligor on a standalone basis and the entity that is supposed 
to influence the obligor’s rating. Both ratings are then aggregated into the obligor’s 
overall rating, where the aggregation mechanism will depend on the degree of
influence that the corporate group or sovereign support is assessed to have.

Finally, the rating analyst will have the ability to override the results as derived 
by steps 1-3 if she thinks that — due to very specific circumstances — the rating 
system does not produce appropriate results for a particular obligor. 

This article will focus on the development of the rating system’s first module, the 
statistical model.³ The major steps in the development of the statistical model are:


¹This occurs, e.g., when a new risk factor has been introduced or when a risk factor is relevant only
for a small sub-sample of obligors.

²There might also be other types of corporate relationships that can induce the support of one
company for another one. For example, a company might try to bail out an important supplier
which is in financial distress. However, since this issue is only a minor aspect of this article we will
concentrate on the most common supporter-relationship in rating practice, i.e. corporate groups
and sovereign support.

³We will, however, also include a short proposal for the empirical estimation of corporate group
influences/sovereign support (step 3).
1. Deployment of software tools for all stages of the rating development process
2. Preparation and validation of the data needed for rating development (typically
   external as well as internal data sets)⁴
3. Calibration of external ratings
4. Sample construction for the internal rating model
5. Single (univariate) factor analysis
6. Multi factor analysis and validation
7. Impact analysis
8. Documentation


This article deals with steps 3–6, each of which will be presented in one separate
section. Nevertheless, we want to provide comments on the other steps and emphasise
their relative importance both in qualitative and in quantitative terms for the
success of a rating development project:


• Initial project costs (i.e. internal resources and time spent for the initial development
  project) will be very high and mainly driven by steps 1-3 (but also 8) with
  step 1 being the single biggest contributor. In contrast, follow-up costs (future
  refinement projects related to the same rating system) can be expected to be
  much lower and more equally distributed across all steps with step 2 most likely
  being the single biggest contributor.

• The importance of step 2 for the statistical analyses that build on it must be
  stressed. Moreover, this step will be even more important when external data
  sets are employed. In this case, it will also be necessary to establish compatibility
  with the internal data set.

• Step 7: Once a new rating system has been developed and validated, it will be
  important to assess the impact of a change to the new system on key internal and
  regulatory portfolio risk measures, including for example, expected loss or
  regulatory and economic capital.

• Regarding step 8 we found it very helpful and time saving to transfer a number of
  the results from statistical analyses to appendices that are automatically generated
  by software tools.


Finally, we want to conclude the introduction with some comments on step 1, the 
deployment of software tools. The objective should be to automate the complex 
rating development process as completely as possible through all the necessary 
steps, in order to reduce the manpower and a priori know-how required to conduct


⁴In this article, the term “external data sets” or “external data” will always refer to a situation
where, in addition to internally rated companies, a typically much larger sample of companies
without an internal rating is employed for rating development. This external data set will often come from
an external data provider such as Bureau van Dijk but can also be the master sample of a data-pooling
initiative. In such a situation, usually only quantitative risk factors will be available for
both the internal and the external data set while qualitative risk factors tend to be confined to the
internal data set. In this situation, a number of specific problems arise that have to be taken into
account. The problems we found most relevant will be dealt with in this article.
a development project. Therefore, different, interconnected tools are needed,
including:


• Datamarts: Standardised reports from the bank’s operating systems or data
  warehouse covering all information relevant for rating development/validation
  on a historical basis

• Data set management: to make external data compatible with internal data, for
  sample construction, etc.

• Statistical analysis tools: tailor made for rating development and validation
  purposes. These tools produce documents that can be used for the rating system’s
  documentation (step 8). These documents comprise all major analyses as
  well as all relevant parameters for the new rating algorithm.

• Generic rating algorithm tool: Allows the application of new rating algorithms
  to the relevant samples. It should be possible to customise the tool with the
  results from the statistical analyses and to build completely new types of rating
  algorithms.


4.2 Calibration of External Ratings 


4.2.1 Introduction 


The first step in building an SRA model is to calibrate the external agencies’ rating 
grades, i.e. to attach a PD to them. The following list summarises the issues we 
found important in this context: 


• External rating types: which types of ratings should be employed?
  – Probability of default (PD)/Expected loss (EL) ratings,
  – Long-/Short-term ratings,
  – Foreign/Local currency ratings

• External rating agencies: pros and cons of the different agencies’ ratings with
  respect to the shadow rating approach

• Default definition/Default rates: differences between external and internal definitions
  of the default event and of default rates will be discussed

• Samples for external PD estimation: which time period should be included, are
  there certain obligor types that should be excluded?

• PD estimation technique: discussion of the pros and cons of the two major
  approaches, the cohort and the duration-based approach

• Adjustments of PD estimates: if PD estimates do not have the desired properties
  (e.g. monotonicity in rating grades), some adjustments are required

• Point-in-time adjustment: external rating agencies tend to follow a through-the-cycle
  rating philosophy. If a bank’s internal rating philosophy is point-in-time
  then either
  – The external through-the-cycle ratings must be adjusted to make them
    sensitive to changes in macroeconomic conditions or,
  – The effects of developing on a through-the-cycle benchmark must be taken
    into account


The above mentioned issues will be addressed in the following sections. 


4.2.2 External Rating Agencies and Rating Types 


For SRA rating systems, typically the ratings of the three major rating agencies,
Standard & Poor’s (S&P), Moody’s and Fitch, are employed. Two questions arise:


1. For each rating agency, which type of rating most closely matches the bank’s 
internal rating definition? 

2. Which rating agencies are particularly well suited for the purpose of SRA 
development? 


Regarding question 1, issuer credit ratings for S&P and Fitch and issuer ratings
for Moody’s were found to be most suitable since these ratings assess the obligor
and not an obligor’s individual security. Moreover, it will usually make sense to
choose the long-term, local currency versions for all rating agencies and rating
types.⁵

Regarding question 2, the major pros and cons were found to be the following:


• Length of rating history and track record: S&P and Moody’s dominate Fitch. See
  e.g. Standard and Poor’s (2005), Moody’s (2005), and Fitch (2005).

• Rating scope: while both S&P and Fitch rate an obligor with respect to its
  probability of default (PD), which is consistent with banks’ internal ratings as
  required by Basel II, Moody’s assesses its expected loss (EL). This conclusion
  draws on the rating agencies’ rating definitions (cf. Standard and Poor’s (2002),
  Moody’s (2004), and Fitch 2006), discussions with rating agency representatives
  and the academic literature (cf. Guttler 2004).

• Are differences between local and foreign currency ratings (LC and FC) always
  identifiable? While S&P attaches a local and foreign currency rating to almost
  every issuer rating, this is not always the case for Moody’s and Fitch.


Based on an assessment of these pros and cons it has to be decided whether one 
agency will be preferred when more than one external rating is available for one 
obligor. 

The following sections will deal with PD estimations for external rating grades. 
In this context we will — for the sake of simplicity — focus on the agencies S&P and 
Moody’s. 


⁵Long-term ratings are chosen because of the Basel II requirement that banks are expected to use a time
horizon longer than one year in assigning ratings (BCBS (2004), § 414) and because almost all
analyses of external ratings are conducted with long-term ratings. Local currency ratings are
needed when a bank measures transfer risk separately from an obligor’s credit rating.
4.2.3 Definitions of the Default Event and Default Rates


For the PD estimates from external rating data to be consistent with internal PD 
estimates, (a) the definition of the default event and (b) the resulting definition of 
default rates (default counts in relation to obligor counts) must be similar. While
there might be some minor differences regarding the calculation of default rates,⁶
the most important differences in our opinion stem from different definitions of the
default event. Here are the most important deviations⁷:


• Different types of defaults (bank defaults vs. bond market defaults): a company
  that has problems meeting its obligations might, e.g., first try to negotiate with its
  bank before exposing it to a potential default in the bond market.

• Differences in qualitative default criteria: according to Basel II, a company is to
  be classified as default when a bank considers that the obligor is unlikely to pay
  its credit obligations in full. This could easily apply to companies that are in the
  lowest external non-default rating grades.⁸

• Number of days of delayed payment that will lead to default
  – Basel II: 90 days
  – S&P: default when payments are not made within the grace period, which
    typically ranges from 10 to 30 days
  – Moody’s: 1 day

• Materiality: While external agencies will measure defaults without respect to the
  size of the amount due, under Basel II, payment delays that are small with
  respect to the company’s overall exposure will not be counted as defaults.


In order to assess the effects of these and other differences in default definition
on estimated PDs, the default measurement of S&P and Moody’s has to be
compared with the bank’s internal default measurement. In a first step (a), S&P and
Moody’s could be compared with each other. If the differences between the two
external agencies are not significant, internal defaults can be compared with the
pooled external defaults of S&P and Moody’s in a second step (b). The following
technique might be useful for steps (a) and (b):


⁶Examples: (a) While the external agencies count the number of obligors only at the beginning of
the year and then the resulting defaults from these obligors over the year, a bank might count on a
finer basis (e.g., monthly) in order to track as many obligors as possible; (b) defaults that occur
because of foreign currency controls and not because the individual obligor is not able to meet its
obligations should not be counted as default for the purpose of PD-estimation if a bank quantifies
transfer risk separately.

⁷The Basel II default definition is given in (BCBS (2004), § 452). The rating agencies’ default
definitions are described in their respective default reports (cf. Standard and Poor’s (2005),
Moody’s (2005), and Fitch 2005).

⁸This assessment draws on external agencies’ verbal definitions of those rating grades (cf.
Standard and Poor’s (2002), Moody’s (2004), and Fitch 2006).
1. Estimation of the ratio of Moody’s defaults for each S&P default and the ratio of
   external defaults for each internal default respectively.

2. This ratio can be interpreted as an adjustment factor with which (a) PDs derived
   for Moody’s have to be scaled in order to arrive at PDs compatible with S&P and
   (b) with which external PDs have to be adjusted in order to be comparable with
   internally derived PDs.

3. Calculation of confidence intervals for the resulting estimators using a multinomial
   model and a Chi-square-type test statistic.⁹


Depending on the estimation results it has to be decided whether an adjustment
factor should be applied. If estimators prove to be very volatile, additional default
data (e.g. from data-pooling initiatives) might be needed to arrive at more reliable
estimates.
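
Steps 1 and 2 of this technique can be sketched as follows for the comparison of external and internal defaults. The counts are purely hypothetical, the function name `default_ratio` is our own, and dividing the external PD by the ratio is one reading of the “scaling” mentioned in step 2. The sketch only produces the point estimate; the exact confidence limits of step 3 are not implemented here.

```python
def default_ratio(both, external_only, internal_only):
    """Estimated number of external defaults per internal default.

    The arguments count defaulted companies by outcome:
    both          -- default recorded externally and internally
    external_only -- external default but no internal default
    internal_only -- internal default but no external default
    """
    external = both + external_only
    internal = both + internal_only
    if internal == 0:
        raise ValueError("no internal defaults observed")
    return external / internal


# Hypothetical counts, for illustration only
ratio = default_ratio(both=40, external_only=5, internal_only=15)

# Hypothetical external PD for one rating grade
external_pd = 0.012

# Assumption: an external PD is made comparable with internal PDs by dividing
# it by the number of external defaults per internal default
internal_comparable_pd = external_pd / ratio

print(f"External defaults per internal default: {ratio:.3f}")
print(f"Adjusted PD: {internal_comparable_pd:.4%}")
```

With so few observations the ratio can move considerably when a handful of defaults is reclassified, which is the volatility the text warns about.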


4.2.4 Sample for PD Estimation 


For the estimation of external PDs the obligor samples of S&P and Moody’s as used 
by these agencies to derive default rates in their annual default reports can be 
employed.¹⁰ The following two dimensions of sample construction should in our
opinion be closely analysed: 


1. Obligor sector and country: should all obligor types be included irrespective of 
industry sector and country? 
2. Length of time series 


With respect to (1), one can start with the hypothesis that, as rating agencies
claim, external ratings are comparable across industry sectors and countries.¹¹
Consequently, for those rating types (S&P and Fitch) that aim to measure an 
obligor’s PD, PD estimates would only have to be conditional on an obligor’s 
rating grade, not its industry sector or country. Where ratings measure an obligor’s 
EL for senior unsecured obligations (Moody’s), however, PD estimates would also 
have to be conditional on all obligor characteristics that affect the LGD on these 
obligations, as could — for example — be the case for a company’s industry sector or 
home country. But if LGD differences across obligors are small compared to PD 


⁹For example, for the comparison of external and internal defaults, the multinomial random
variable would for each defaulted company indicate one of three potential outcomes: (1) External
and internal default, (2) External default but no internal default, (3) Internal default but no external
default. Moreover, due to the typically small amount of data, no large-sample approximation but
the exact Chi-square distribution should be employed. Confidence limits can be estimated by
applying the test statistic on a sufficiently fine grid for the parameters of the multinomial
distribution.

¹⁰See Standard and Poor’s (2005) and Moody’s (2005).

¹¹See agencies’ rating definitions: Standard and Poor’s (2002) and Moody’s (2004) respectively.
differences between rating grades, estimates based only on the rating grade might
be tolerable for pragmatic reasons.

To address the first issue (comparability of ratings across countries/sectors), the
literature on differences between external default rates across industry sectors and
countries should be reviewed. We found only three papers on the default rate
issue.¹² None identified country specific differences while they were inconclusive
with respect to sector specific differences.¹³

Regarding the second issue (relative size of the LGD effect), the bank’s internal 
LGD methodology should be analysed with respect to differences between senior 
unsecured LGDs across industries and countries.¹⁴ Based on the assessment of both
issues it should be decided as to whether country or industry sector specific 
estimates are needed. 

We now turn to the second dimension of sample construction, i.e. the length of 
the time series. On the one hand, a long time series will reduce statistical uncer- 
tainty and include different states of the business cycle. On the other hand, there is 
the problem that, because of structural changes, data collected earlier might not
reflect current and future business conditions. A sensible starting point will be the 
time horizon that is most often used by both the rating agencies and the academic 
literature (starting with the years 1981 and 1983 respectively). One can then analyse 
changes in rating grade default rates over time and assess whether structural 
changes in the default rate behaviour can be identified or whether most of the 
variability can be explained by business cycle fluctuations. 


4.2.5 PD Estimation Techniques 


Once the sample for PD estimation has been derived, the estimation technique must
be specified. Typically, the so-called cohort method (CM) is applied where the
number of obligors at the beginning of each year in each rating grade and the
number of obligors that have defaulted in this year are counted respectively. Both
figures are then summed over all years within the time horizon. The resulting PD
estimate is arrived at by dividing the overall number of defaults by the overall
number of obligors.¹⁵
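
For a single rating grade, the cohort method reduces to the following sketch. The yearly counts are hypothetical and the function name `cohort_pd` is our own.

```python
def cohort_pd(obligors_by_year, defaults_by_year):
    """Cohort-method PD estimate for one rating grade.

    obligors_by_year -- obligors in the grade at the beginning of each year
    defaults_by_year -- how many of these obligors defaulted during that year
    """
    if len(obligors_by_year) != len(defaults_by_year):
        raise ValueError("both series must cover the same years")
    total_obligors = sum(obligors_by_year)
    total_defaults = sum(defaults_by_year)
    if total_obligors == 0:
        raise ValueError("no obligors observed in this grade")
    return total_defaults / total_obligors


# Hypothetical cohort data for one rating grade over four years
obligors = [120, 135, 150, 160]
defaults = [1, 0, 3, 2]

pd_estimate = cohort_pd(obligors, defaults)
print(f"Cohort PD estimate: {pd_estimate:.4%}")
```

Note that a grade without any observed default receives a PD estimate of exactly zero under this method.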

The duration-based (DB) approach aims to improve on the cohort-method by 
including information on rating migration in the estimation process. The underlying 


¹²See Ammer and Packer (2000), Cantor and Falkenstein (2001), and Cantor (2004).

¹³Ammer and Packer (2000) found default-rate differences between banks and non-banks. However,
they pointed out that these differences are most likely attributable to a specific historic event,
the US Savings and Loans crisis, and should therefore not be extrapolated to future default rates.
Cantor and Falkenstein (2001), in contrast, found no differences in the default rates of banks and
non-banks once one controls for macroeconomic effects.

¹⁴For a discussion of LGD-estimation methods we refer to Chapter VIII of this book.


¹⁵This method can be improved on by counting on a monthly or even finer basis.
idea is to interpret default events as the result of a migration process. In the simplest
setting where the migration process can be assumed to follow a stationary Markov
process, a T-year migration matrix M_T can be derived by applying the one-year
migration matrix M_1 T times:


\[ M_T = M_1^T \tag{4.1} \]

The continuous time analogue of (4.1) is

\[ M_t = \mathrm{Exp}(m \cdot t), \tag{4.2} \]


where m is the marginal migration matrix, t the time index and Exp(.) the matrix
exponential.¹⁶ Hence, M_1 (including in particular 1-year default probabilities) can
be derived by first estimating m from transition counts and then applying the matrix 
exponential to the estimated marginal transition matrix. A detailed description of 
the duration-based approach (DB) and the cohort method (CM) can be found in 
Schuermann and Hanson (2004). They also state the major differences between CM 
and DB estimates, in particular, that the latter produce PDs that spread more widely 
across the rating scale, i.e. PDs for good rating grades will be much lower and PDs 
for bad ratings will be much higher under DB than under CM. 
Both estimation techniques have their pros and cons: 


• DB makes more use of the available information by also taking into account
  rating migrations. For this reason, the DB method can also produce positive PD
  estimates for the best rating grades where no default observations are available.

• CM is more transparent and does not rely on as many modelling assumptions as
  the DB method.
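
The first point can be illustrated with a small sketch of (4.1) and (4.2). It uses a hypothetical marginal migration matrix m with two rating grades A and B and a default state D, in which no direct migration from A to default occurs, and it evaluates the matrix exponential by simply truncating the exponential series. The matrix entries, the helper names `mat_mul` and `mat_exp` and the number of series terms are assumptions made for illustration; the estimation of m from transition counts is not shown.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, t=1.0, terms=30):
    """Approximate Exp(m * t) by the truncated exponential series."""
    n = len(m)
    mt = [[m[i][j] * t for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current series term, starts at I
    for k in range(1, terms):
        # term becomes (m * t)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, mt)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Hypothetical marginal migration matrix for the states A, B and D (default).
# Each row sums to zero; there is no direct migration from A to default.
m = [
    [-0.10, 0.10, 0.00],
    [0.05, -0.15, 0.10],
    [0.00, 0.00, 0.00],
]

M1 = mat_exp(m, t=1.0)         # one-year migration matrix, as in (4.2)
M2 = mat_exp(m, t=2.0)         # two-year migration matrix, as in (4.2)
M1_squared = mat_mul(M1, M1)   # two-year migration matrix via (4.1)

print(f"1-year PD of grade A: {M1[0][2]:.6f}")
print(f"2-year PD of grade A: {M2[0][2]:.6f}")
```

Although grade A never defaults directly in this example, its 1-year PD is positive because an obligor can first migrate to B and then default within the same year. The two routes to the two-year matrix agree up to rounding, as the stationary Markov assumption implies.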


As long as there is no clear-cut empirical evidence on the relative performance 
of both methods, it seems therefore sensible to apply both techniques and compare 
the resulting estimates. However, it is likely that in the future such comparisons will 
become available and therefore it will be helpful to keep an eye on the 
corresponding regulatory and academic discussion. 


4.2.6 Adjustments 


Because the PD estimates resulting from the application of the estimation methods 
as described in the previous section will not always be monotonic (i.e. not always 
will PD estimates for better rating grades be lower than for worse rating grades), the 
estimates have to be adapted in these non-monotonic areas. One option is to regress 


¹⁶The matrix exponential applies the exponential series to matrices: exp(m) = I + m^1/1! + m^2/2! + ...,
where I is the identity matrix.




the logarithm of the PD estimates on the rating grades and to check whether the 
interpolations that result for the non-monotonic areas are within confidence limits. 
Here are some comments on the underlying techniques: 


• Regression

  – In order to perform the regression, a metric interpretation has to be given to
    the ordinal rating grades. Plots of PD estimates against rating grades on a
    logarithmic scale suggest that this approach is sensible from a pragmatic
    point of view (cf. Altman and Rijken 2004).

  – It may make sense to weight the regression by the number of observations
    available for each rating grade since the precision of PD estimates depends
    on it.

• Confidence intervals (CI)

  – For the cohort approach, confidence intervals can be derived from the
    binomial distribution by assuming independent observations.¹⁷

  – It is usually assumed that default observations are correlated because of
    macroeconomic default drivers that affect the default behaviour of different
    obligors. Hence, binomial confidence intervals will be a conservative estimate
    (they are tighter than they would be under correlated defaults). CIs
    derived from a Merton style simulation model (cf. Chap. 15 of this book)
    could be the logical next step.

  – In the context of the duration-based method, CIs are typically derived via
    bootstrap methods (cf. Schuermann and Hanson 2004). These tend to be
    even tighter. The topic of correlated defaults/migrations has to our knowledge
    not yet been addressed in this context.
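
The regression and the binomial confidence intervals for the cohort approach can be combined as in the following sketch. It rests on several assumptions that are ours and not part of the text: the rating grades are mapped to the integers 1 to 5 as their metric interpretation, the regression is weighted by the number of obligors per grade, the obligor and default counts are hypothetical (with a non-monotonic spot between grades 2 and 3), and the exact binomial limits are found by plain bisection rather than by the more efficient method referenced in the footnote.

```python
import math


def weighted_loglinear_fit(grades, pds, weights):
    """Weighted least squares of ln(PD) on a numeric grade index."""
    total = sum(weights)
    ys = [math.log(p) for p in pds]
    x_mean = sum(w * x for w, x in zip(weights, grades)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total
    sxy = sum(w * (x - x_mean) * (y - y_mean)
              for w, x, y in zip(weights, grades, ys))
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, grades))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope  # (intercept, slope)


def binom_cdf(k, n, p):
    """P(X <= k) for X binomially distributed with parameters n and p."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))


def exact_binomial_ci(k, n, alpha=0.05):
    """Exact two-sided binomial confidence limits, assuming independence."""
    def root(f, target):
        # f is decreasing in p; bisection for f(p) = target on [0, 1]
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else root(lambda p: binom_cdf(k - 1, n, p),
                                    1 - alpha / 2)
    upper = 1.0 if k == n else root(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Hypothetical cohort data: grade index, obligors and defaults per grade
grades = [1, 2, 3, 4, 5]
obligors = [2000, 1500, 1000, 600, 300]
defaults = [2, 9, 5, 18, 30]
pds = [k / n for k, n in zip(defaults, obligors)]

intercept, slope = weighted_loglinear_fit(grades, pds, obligors)
fitted = [math.exp(intercept + slope * g) for g in grades]
intervals = [exact_binomial_ci(k, n) for k, n in zip(defaults, obligors)]

for g, pd_raw, pd_fit, (lo, hi) in zip(grades, pds, fitted, intervals):
    inside = lo <= pd_fit <= hi
    print(f"grade {g}: raw {pd_raw:.4f}, fitted {pd_fit:.4f}, "
          f"CI [{lo:.4f}, {hi:.4f}], fitted inside CI: {inside}")
```

With a positive slope the fitted PDs are monotonic in the grade by construction, so the remaining question is only whether the smoothed values stay within the confidence limits of the raw estimates.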


4.2.7 Point-in-Time Adaptation 


In the context of Basel II, a bank’s rating system is supposed to measure an obligor’s
probability of default (PD) over a specific time horizon (the next T years). In 
practice, the objective of rating systems differs, particularly with respect to: 


1. The time horizon chosen by a bank 
2. Whether PDs are conditional on the state of the business cycle (through-the- 
cycle philosophy, TTC) or not (point-in-time philosophy, PIT) 


While the first point can be taken into account by correspondingly adjusting the
time horizon for default rate estimation, a bank that follows a PIT approach will have
to apply PIT-adjustments to the PD estimates derived for external rating grades
since external rating agencies tend to follow a TTC-approach.¹⁸


¹⁷ For an efficient derivation and implementation of exact confidence limits for the binomial
distribution see Daly (1992).


Table 4.1 Comparison of point-in-time and through-the-cycle rating systems

Issue                         Point-in-time (PIT)               Through-the-cycle (TTC)

What does the rating system   Unconditional PD                  PD conditional on the state of
measure?                                                        the business cycle. The PD
                                                                estimate might be either
                                                                conditional on a worst case
                                                                (“bottom of the cycle
                                                                scenario”)ᵃ or on an average
                                                                business cycle scenario

Stability of an obligor’s     Pro-cyclical: Rating improves     Stable: Rating grades tend to be
rating grade over the cycle   during expansions and             unaffected by changes in the
                              deteriorates in recessions        business cycle

Stability of a rating         Stable: Unconditional PDs of      Pro-cyclical: PDs improve
grade’s unconditional PD      rating grades do not change.      during expansions and
                              Obligors’ higher unconditional    deteriorate during recessions
                              PDs during recessions are
                              accounted for by migrations to
                              lower rating grades and vice
                              versa

ᵃ This has for example been suggested by a survey of bank rating practices by the Basel Committee’s
Model Task Force (cf. BCBS 2000).

In the remainder of this section we will (a) analyse the effects resulting from the 
development of rating systems on TTC-PDs and (b) outline a technique for PIT
adjustments of external rating grades. To address both points, we first summarise 
the most important properties of PIT and TTC rating systems in Table 4.1. These 
properties follow straightforwardly from the above definitions. A detailed discus- 
sion can be found in Heitfield (2004). 

Turning to the first point of investigation, we now list the most important 
consequences when developing a rating system on a TTC-PD benchmark: 


• Pure macroeconomic risk factors that focus on business cycle information will
explain only the (typically quite small) PIT-part of external ratings and will
therefore tend to receive very low weights in statistical models.

• This effect should be less pronounced for “mixed factors” that contain both
business cycle information and non-business cycle elements, for example balance
sheet ratios or country ratings.


¹⁸ The TTC-property of external ratings has been observed in the academic literature (cf. Löffler
2004) and has also been proved to be significant by our own empirical investigations. It must,
however, be stressed that in practice rating systems will be neither completely TTC nor
completely PIT but somewhere in between.


A bank that follows a PIT rating approach but has not yet finalised a fully-fledged
PIT-adaptation of external ratings might therefore manually adjust regression results
in order to attach higher weights to pure business-cycle risk factors. For banks that
already want to implement a statistically founded PIT-adaptation of external ratings,
the following approach could be considered:


• Estimation of a classic default prediction model, for example via logistic regression
(see Chap. 1), with external PDs and business cycle factors (on a regional,
country or industry level) as risk factors

• The dependent variable is the company’s default indicator as measured by the
external rating agencies’ default definition (or, where available, the bank’s own
default definition). Accordingly, data from external rating agencies will be
needed on a single obligor level while for TTC-PD estimation, aggregate obligor
and default counts are sufficient.
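One possible way to write down such a model is given here as a sketch only. The notation is an assumption made for illustration and is not taken from the text: $D_{i,t+1}$ is the default indicator of obligor $i$ over the horizon following time $t$, $PD^{\mathrm{ext}}_{i,t}$ is the TTC-PD of its external rating grade, $z_{c(i),t}$ is a vector of business cycle factors for the region, country or industry $c(i)$ of the obligor, and entering the external PD in logarithmic form is a modelling choice rather than a requirement.

```latex
\Pr\bigl(D_{i,t+1} = 1 \mid PD^{\mathrm{ext}}_{i,t},\, z_{c(i),t}\bigr)
  = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 \ln PD^{\mathrm{ext}}_{i,t}
      + \gamma^{\top} z_{c(i),t})\bigr)}
```

Under this specification, the fitted probability would play the role of the PIT-adjusted PD, and the estimated $\gamma$ would capture how strongly the business cycle factors shift the PD relative to the external TTC benchmark.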


When estimating such a model, the following challenges are pertinent: 


• Different countries have different macroeconomic indicators that might not be
comparable.

• Because estimating separate models for separate countries will not be feasible
due to data restrictions, it will be important to use indicators that are approximately
comparable across countries.

• To get a picture of the related effects, it might be sensible to start by building a
model for the US (where data availability is high) and see how parameter
estimates change when other countries are added. Separate regional
models may also help.


An alternative approach would be to use external point-in-time rating systems 
for the PIT-adaptation of through-the-cycle agency ratings. An example of a point- 
in-time external rating is Moody’s KMV’s EDF credit risk measure that builds on a 
Merton style causal default prediction model.¹⁹ Analysis is then required as to
whether it would not be better to skip the through-the-cycle agency ratings altogether
and replace them with the external point-in-time ratings. In deciding on
which approach to take, a bank must trade off the associated costs against the
availability of the respective benchmarks.²⁰


¹⁹ See http://www.moodyskmv.com/.

²⁰ For example, market-based measures such as Moody’s KMV’s EDF are only available for public
companies.


4.3 Sample Construction for the SRA Model


4.3.1 Introduction


Once external PDs have been calibrated, and all internal and external data required
for the development of the SRA model have been compiled, it is necessary to
construct samples from these data. As we will see, different samples will be needed
for different types of statistical analysis. In this section we mention these analysis
techniques in order to map them to the corresponding samples. The techniques will
be described in Sects. 4.4 and 4.5. In this section, the following issues will be dealt
with:


• Which types of samples are needed?

• How can these samples be constructed?

• Weighted observations: If the information content of different observations
differs significantly, it might be necessary to allow for this by attaching different
weights to each observation.

• Correlated observations: We discuss the correlation structure that may result
from the described sample construction technique and discuss the consequences.


It should be noted that some parts of the sample construction approach described 
in this section might be too time consuming for an initial development project. 
Nevertheless, it can serve as a benchmark for simpler methods of sample construc- 
tion and could be gradually implemented during future refinements of the initial 
model. 


4.3.2 Sample Types 


The samples relevant for the development of SRA rating systems can be classified 
by the following dimensions: 


• Samples for single (univariate) factor analysis (e.g. univariate discriminatory
power, transformation of risk factors) vs. multi factor analysis samples (e.g.
regression analysis, validation)

• Samples that include only externally rated obligors vs. samples that include
both externally rated and only internally rated obligors

• External data vs. internal data²¹

• Development vs. validation sample


²¹ External data are often employed for the development of SRA rating systems in order to increase
the number of obligors and the number of points in time available for each obligor. See Sect. 4.1
for more details.


We will start with the first dimension. Univariate analysis investigates the
properties of each single risk factor separately. Therefore, for this type of analysis
each change of the one analysed factor will generate a new observation in the data
set; for the multi factor analysis, each change of any risk factor will produce a new
observation. This can be taken into account by the following approach to sample
construction:


1. Risk factors are divided into different categories. All factors for which changes are
triggered by the same event are summarised into the same risk factor category.²²

2. The samples for the univariate risk factor analysis are constructed separately for
each category. A complete series of time intervals is built that indicates which
risk factor combination is valid for the category in each time interval or whether
no observation was available in the interval. The time intervals are determined
by the points in time where the risk factors of the category under consideration
change. This is done separately for each obligor.

3. All single category samples from step 2 are merged into a new series of time
intervals. Each interval in the series is defined as the largest interval for which
the risk factors in each category remain constant. This is done separately for
each obligor.
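The merge in step 3 can be sketched in a few lines of code. The following is an illustrative sketch under simplifying assumptions: time is measured in whole months, interval bounds are inclusive, the function and variable names are invented, and the input reproduces the single obligor of Table 4.2 with made-up labels B1, B2, Q1 and Q2 for the risk factor combinations.

```python
def ym(year, month):
    """Encode a calendar month as an integer so that months can be compared."""
    return year * 12 + (month - 1)

def merge_category_samples(samples):
    """Merge the single-category interval series of one obligor (step 3).

    samples maps a category name to a list of (first_month, last_month, value)
    tuples with inclusive bounds. The result is a list of
    (first_month, last_month, state) tuples, where state maps every category
    to its valid value, or to None if nothing was observed in that interval.
    """
    # A new merged interval starts wherever any category starts or stops.
    cuts = sorted({m for series in samples.values()
                   for first, last, _ in series
                   for m in (first, last + 1)})
    merged = []
    for start, next_start in zip(cuts, cuts[1:]):
        state = {}
        for category, series in samples.items():
            state[category] = next(
                (value for first, last, value in series if first <= start <= last),
                None)
        # Skip gaps in which no category has a valid observation.
        if any(v is not None for v in state.values()):
            merged.append((start, next_start - 1, state))
    return merged

samples = {
    "balance_sheet": [(ym(2003, 1), ym(2003, 12), "B1"),
                      (ym(2004, 1), ym(2004, 12), "B2")],
    "qualitative": [(ym(2003, 5), ym(2004, 3), "Q1"),
                    (ym(2004, 4), ym(2004, 12), "Q2")],
}
merged = merge_category_samples(samples)
for first, last, state in merged:
    print(first, last, state)
```

For the input above the function yields four merged intervals (Jan 03 to April 03, May 03 to Dec 03, Jan 04 to March 04, April 04 to Dec 04), matching the merged sample in Table 4.2. The sketch does not coalesce neighbouring intervals whose risk factor values happen to be identical; a production version would probably want to do so.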


In the following table we give an example comprising two risk factor categories 
(balance sheet data and qualitative factors) and hence two different samples for 
univariate factor analysis. Table 4.2 displays the observations for one single obligor. 

For each of the sample types described above, two sub-types will be needed, one 
that includes only externally rated obligors and one that contains all obligors. The 
first sub-type will be needed e.g. for discriminatory power analysis, the second e.g.
for risk factor transformation or validation.

A third dimension is added when external as well as internal data are employed. 
Typically, for SRA models, external data will be used to estimate the quantitative 
model (comprising balance sheet factors as well as macroeconomic indicators) 
while the complete model, consisting of both quantitative and qualitative risk
factors, will be calculated on the internal data set because qualitative risk factors
are not available for the external data set. 

A fourth dimension comes with the need to distinguish between development and
validation samples. Moreover, validation should not only rely on the external PD
but should also include default indicator information, i.e. the information whether
a company has or has not defaulted within a specific period of time after its rating
has been compiled.


Table 4.2 Stylised example for different samples and observations involved in rating development

Sample                   Trigger           ID   Valid from   Valid until

Balance sheet data       Accounts          1    Jan 03       Dec 03
                                           2    Jan 04       Dec 04
Qualitative factors      Internal rating   1    May 03       March 04
                                           2    April 04     Dec 04
Multi factor (merged)    Accounts          1    Jan 03       April 03
                         Internal rating   2    May 03       Dec 03
                         Accounts          3    Jan 04       March 04
                         Internal rating   4    April 04     Dec 04


²² One category might for example include all balance sheet factors (triggered by the release of a
company’s accounts). Another category will be qualitative factors as assessed by the bank’s loan
manager. They are triggered by the internal rating event. A third category might be macroeconomic
indicators.

When validating with respect to the default indicator, the need for the separation 
of development and validation samples is not so pressing since the benchmarks 
employed for development and validation are different. Due to the typical scarcity 
of internal default data (the rationale for the SRA approach), it is sensible to 
perform this type of validation on the complete internal data set. 

However, when validating with respect to external PDs, a separation between 
development and validation sample is desirable. If the quantitative model has been 
developed on external data, the internal data set should typically be an appropriate 
validation sample.²³ For the validation of the complete model, depending on the
number of observations available relative to the number of risk factors, the following
options can be considered:

• Constructing two completely different samples (preferably out-of-time²⁴)

• Developing on the complete internal sample and validating on a subset of this
sample, e.g. the most recent observations for each obligor or some randomly
drawn sub-sample

• Application of bootstrap methods²⁵


Summarising the issues raised in this section, Table 4.3 gives a simple example 
of the different samples involved in SRA rating development and the types of 
statistical analysis performed on these samples. For simplicity, our example com- 
prises only two input categories of which only one (balance sheet data) is available 
for the external and the internal data set and the other (qualitative factors) is only 
available for the internal data set. 


4.3.3 External PDs and Default Indicator 


For those samples consisting only of externally rated obligors (EX) and for those 
samples that are employed for validation on the default indicator (VAL-D), an 
external PD or the default indicator have to be attached to each line of input variables 
respectively. At least two different approaches to achieve this can be considered: 


²³ Note that the external sample will typically also include some or almost all internal obligors. To
construct two completely different sets, internal obligors would have to be excluded from the
external data. However, if the external data set is much larger than the internal data set, such
exclusion might not be judged necessary.


²⁴ “Out-of-time” means that development and validation are based on disjoint time intervals.


²⁵ For an application of bootstrap methods in the context of rating validation see Appasamy et al.
(2004). A good introduction to and overview of bootstrap methods can be found in Davison and
Hinkley (1997).


Table 4.3 Stylised example for the different samples and corresponding types of analysis that are
needed for the development of SRA type rating systems

IDᵃ    Input categories      Sample typeᵇ            Type of analysis

E1     Balance sheet data    SC   EX    DEV          Representativeness, Fillers for missing
                                                     values, Univariate discriminatory
                                                     power, Estimation of the quantitative
                                                     multi factor modelᶜ

I1a    Balance sheet data    SC   ALL   DEV          Representativeness, Truncation and
                                                     standardisation of risk factors, Fillers
                                                     for missing values

I1b                               EX    VAL-E/DEV    Univariate discriminatory power,
                                                     Validation of the quantitative multi
                                                     factor model developed on sample E1

I2a    Qualitative factors   SC   ALL   DEV          Standardisation of risk factors, Fillers
                                                     for missing values

I2b                               EX                 Score calculation, Univariate
                                                     discriminatory power

I3a    Balance sheet data    M    EX                 Risk factor correlations/multicollinearity,
       and qualitative                               Estimation of the complete multi
       factors                                       factor model (quantitative and
                                                     qualitative)

I3b                               ALL   DEV/VAL-D    Risk factor correlations/multicollinearity,
                                                     default indicator validation of the
                                                     complete multi factor model
                                                     developed on sample I3a

I4                                EX    VAL-E        Separate validation sample, for example
                                                     most recent observations for all
                                                     obligors from sample I3a or a
                                                     randomly drawn sub-sample

ᵃ E denotes external and I denotes internal data.

ᵇ We write SC for single-category samples and M for merged samples. ALL and EX stand
for “all obligors” and “only externally rated obligors” respectively. DEV denotes development
sample, VAL-E and VAL-D denote validation samples where validation is performed on external
PDs and on the default indicator respectively.

ᶜ Note that in this case it is not necessary to merge different single-factor samples in order to
perform the multi-factor analysis, because only one input-category exists. Moreover, a separate
validation sample for the external data is not necessary since validation is performed on the
internal data set.


1. External PDs/the default indicator are treated as yet another risk factor category,
i.e. a series of time intervals is constructed for each external rating agency/for
the default indicator indicating the time spans for which a specific external
rating/default indicator realisation had been valid. These intervals are then
merged with the relevant single factor or merged factor samples in the same
way as single factor samples are merged with each other.²⁶ If there are competing
PDs from different external agencies at the same time, an aggregation rule will be
applied. We will discuss this rule in the second part of this section.


²⁶ Note that the time intervals of input factors and default indicator are shifted against each other by
the length of the time horizon for which the rating system is developed. For example, if the horizon
is 1 year and the default indicator is equal to zero from Jan 2003 to Dec 2004 then this value will be
mapped to the risk-factor interval from Jan 2002 to Dec 2003.

2. For each risk factor time interval, a weighted average is determined for each 
external agency PD and for the default indicator respectively. The weights are 
chosen proportionally to the length of the time interval for which the external 
rating/the default indicator has been valid. As under 1), an aggregation rule is 
applied to translate the PDs of different external agencies into one single 
external PD. 


For the default indicator the first approach seems to be more adequate, since with 
the second approach the 0/1 indicator variable would be transformed into a contin- 
uous variable on the interval [0,1] and many important analytical tools (e.g. the 
ROC curve) would not be directly applicable. 

This argument obviously does not apply to the external PDs since they are
already measured on the interval [0,1]. Moreover, external PDs tend to change more 
frequently than the default indicator and hence the number of observations would 
increase markedly compared to the corresponding risk factor samples. Additionally, 
the PDs of not only one but three different rating agencies would have to be merged, 
further increasing the number of observations. Since the information content of 
different observations belonging to the same risk factor combination will tend to 
differ only slightly, such a procedure will produce many highly correlated observa- 
tions which is not desirable (see Sect. 4.3.5). Consequently the second approach 
appears to be more adequate for external PDs. 

As mentioned above, an aggregation rule has to be devised for cases where more 
than one external rating is valid at some point in time. The most straightforward 
choice will be weighted averages of the different external PDs with a preferential 
treatment of those rating agencies that are assessed to be most suitable for SRA 
development (see Sect. 4.2.2). 
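The second approach for external PDs, combined with a simple aggregation rule, might look as follows. This is a sketch under stated assumptions: time is counted in whole months with inclusive interval bounds, the agency names and the preference weights are hypothetical, and the aggregation is a plain weighted average over those agencies that actually rated the obligor during the interval.

```python
def time_weighted_pd(interval, pd_history):
    """Average PD of one agency over a risk factor interval.

    interval is (first_month, last_month); pd_history is a list of
    (first_month, last_month, pd) tuples, all bounds inclusive. Each PD is
    weighted by the number of months it was valid inside the interval.
    Returns None if the agency had no valid rating in the interval.
    """
    lo, hi = interval
    weighted_sum = 0.0
    covered = 0
    for first, last, pd in pd_history:
        overlap = min(hi, last) - max(lo, first) + 1
        if overlap > 0:
            weighted_sum += overlap * pd
            covered += overlap
    return weighted_sum / covered if covered else None

def aggregate_agencies(pd_by_agency, preference_weights):
    """Weighted average over the agencies that provide a PD."""
    available = {a: p for a, p in pd_by_agency.items() if p is not None}
    if not available:
        return None
    total = sum(preference_weights[a] for a in available)
    return sum(preference_weights[a] * p for a, p in available.items()) / total

# Hypothetical example: one 12-month risk factor interval, two agencies.
risk_factor_interval = (0, 11)
histories = {
    "agency_a": [(0, 2, 0.01), (3, 11, 0.02)],   # rating change after 3 months
    "agency_b": [(0, 11, 0.01)],
}
weights = {"agency_a": 2.0, "agency_b": 1.0}     # agency_a judged more suitable

pds = {a: time_weighted_pd(risk_factor_interval, h) for a, h in histories.items()}
external_pd = aggregate_agencies(pds, weights)
print(pds, external_pd)
```

Because each risk factor interval receives exactly one external PD, the number of observations does not grow with the frequency of external rating changes, which is the motivation given above for preferring this approach.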


4.3.4 Weighting Observations 


The information content of a single observation in the different samples depends on 
the length of the time interval it is associated with. If, for example, a particular 
balance sheet B is valid from Jan 04 to Dec 04 and we observe two corresponding 
sets of qualitative factors, Q1 (valid until Feb 04) followed by Q2 (valid from 
Feb 04 until Dec 04) we would obviously like to put a much higher weight on the 
observation (B, Q2) than on (B, Q1). 

The most straightforward way is to choose weights that are proportional to the 
length of the time interval associated with a specific observation. In this context, the 
following issues are of particular interest: 


• Stochastic interpretation of weighted observations: The weight attached is a
measure for the size of the error term associated with each observation, i.e. its
standard deviation: the lower the weight, the higher the standard deviation.

• Practical implementation: Most statistics software packages include options to
perform statistical computations with weighted observations. This usually
applies to all techniques mentioned in this article.
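For the regression case, the stochastic interpretation can be made explicit with the standard weighted least squares formulation. The notation is introduced here for illustration only: $y_i$ is the dependent variable of observation $i$ (for example the log external PD), $x_i$ its vector of risk factors, $w_i$ its weight and $\varepsilon_i$ its error term.

```latex
\hat{\beta} = \arg\min_{\beta} \sum_{i} w_i \,\bigl(y_i - x_i^{\top}\beta\bigr)^2,
\qquad
\operatorname{Var}(\varepsilon_i) = \frac{\sigma^2}{w_i},
\qquad
w_i \propto \text{length of the time interval of observation } i
```

An observation that is valid for only a short period thus has a small weight and is treated as having an error term with a large standard deviation $\sigma / \sqrt{w_i}$.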


4.3.5 Correlated Observations 


Correlated observations (or, more precisely, correlated error terms) are a general 
problem in single and multi factor analysis. Basic techniques assume independence. 
Using these techniques with correlated observations will affect the validity of 
statistical tests and confidence intervals, probably also reducing the efficiency of 
estimators. To resolve this problem, information about the structure of the correla- 
tions is necessary. In this article, the correlation issue will be dealt with in two steps: 


1. In this section we will address the specific correlation structure that may arise
from the method of sample construction described above.

2. In Sect. 4.5.3 we will analyse the statistical techniques that can be used to 
address this or other correlation structures in the context of multi factor analysis. 


When constructing samples according to the method described above, the degree 
of correlation in the data will rise when the time intervals associated with each 
observation become smaller. It will also depend on the frequency and intensity of 
changes in the risk factor and the external rating information employed. It is worth 
noting that the resulting type of correlation structure can be best described within a 
panel data setting where the correlations within the time series observations for 
each single obligor will be different to the cross-sectional correlation between two 
obligors. Cross-sectional correlations in SRA development may result from country 
or industry sector dependencies. Time series correlations will typically be due to the 
fact that there are structural similarities in the relationship between a single 
company’s risk factors and its external rating over time. Since models for
cross-sectional correlations are widely applied in credit portfolio models,²⁷ we will focus
on time series correlations in this article. 

In what follows we propose some options for dealing with correlations in the 
time series parts. The options are listed in order of rising complexity: 


• For simplicity, basic statistical techniques are employed that do not account for
correlated error terms. With this option, as much correlation as possible can be
eliminated by dropping observations with small weights. If all observations have
approximately the same weight, a sub-sample can be drawn. Here, the appropriate
balance has to be found between losing too much information in the sample
and retaining a degree of correlation that still appears to be compatible with not
modelling these correlations explicitly. In any case, the remaining correlation in
the data should be measured and the modeller should be aware of the resulting
consequences, in particular with respect to confidence intervals (they will tend to
be too narrow) and with respect to statistical tests (they will tend to reject the
null hypothesis too often).

• Simple models of autocorrelation in the time series data are employed, the most
obvious being a first order autoregressive process (AR1) for the time series error
terms. Of course, higher order AR processes or more complex correlation
models might also be considered appropriate.²⁸

• A continuous time model for the relation between risk factors and external
ratings is built (e.g. Brownian motion or Poisson process type models) and the
resulting correlation structure of the discrete observations’ error terms is derived
from this model. This of course is the most complex option and will most
probably be seen as too time consuming to be applied by most practitioners.
It might, however, be a road for academic researchers that in turn could make the
method available for practitioners in the future.


²⁷ See Erlenmaier (2001).
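Measuring the remaining time series correlation, and estimating the parameter of the AR1 option, can be sketched as below. The sketch assumes that regression residuals are available per obligor in time order and, as a simplification that does not hold exactly for the interval-based samples described above, that the observations are equally spaced. The function name and the obligor identifiers are invented for illustration.

```python
def pooled_lag1_autocorrelation(residuals_by_obligor):
    """Pooled estimate of rho in e_t = rho * e_(t-1) + u_t.

    residuals_by_obligor maps an obligor id to its residuals in time order.
    Products are formed only within an obligor's own series, so no
    cross-sectional information enters the estimate.
    """
    numerator = 0.0
    denominator = 0.0
    for series in residuals_by_obligor.values():
        for previous, current in zip(series, series[1:]):
            numerator += previous * current
        denominator += sum(e * e for e in series[:-1])
    return numerator / denominator if denominator else 0.0

# Hypothetical residuals for two obligors.
residuals = {
    "obligor_1": [0.40, 0.30, 0.35, 0.10],
    "obligor_2": [-0.20, -0.25, -0.05],
}
print(pooled_lag1_autocorrelation(residuals))
```

Under an AR1 process the correlation between two error terms of the same obligor that are k observations apart is rho to the power k, so a single estimate of rho describes the whole time series correlation structure.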


4.4 Univariate Risk Factor Analysis 


4.4.1 Introduction 


Before building a multi factor model, each risk factor has to be analysed separately 
in order to determine whether and in which form it should enter the multi factor 
model. This type of analysis is referred to as univariate risk factor analysis. The 
following issues should be dealt with in this context: 


• Measurement of a risk factor’s univariate discriminatory power

• Transformation of risk factors to (a) improve their linear correlation (as
assumed by the multi factor regression model) with the log external PD²⁹ or
(b) to make different risk factors comparable with each other

• Checking whether the samples on which the rating system is developed are
representative for the samples to which the rating system will be applied
(development vs. “target” sample)

• Treatment of missing values


Each of these issues will be dealt with separately in the following sections. 


²⁸ For an introduction to such models and further references see Greene (2003).


²⁹ See Sect. 4.5.2. Throughout this article we will use the term “log external PD” to denote the
natural logarithm of the PD of an obligor’s external rating grade. How PDs are derived for each
external rating grade has been described in Sect. 4.2.


4.4.2 Discriminatory Power


A rating system is defined as having a high discriminatory power if good rating
grades have a comparatively low share of obligors that will default later on and vice
versa. Accordingly, its discriminatory power will deteriorate with an increase in the
relative share of subsequently defaulting obligors in good rating grades. There are several
statistical measures for this important attribute of a rating system, the Gini coefficient
being the most popular.³⁰

Due to the lack of a sufficient number of default observations in SRA models,
these types of discriminatory power measurement will usually only be applied as an
additional validation measure. In the development stage, discriminatory power will
be defined in terms of the usefulness of the rating system or, in the context of
univariate factor analysis, of a single risk factor in predicting an obligor’s external
PD: The better a rating system or a risk factor can be used to predict an obligor’s
external PD, the higher its discriminatory power for the SRA approach.³¹

The following techniques can be helpful to measure a risk factor’s discrimina- 
tory power for the SRA approach: 


• Linear and rank-order correlations of the risk factors with the log external PD³²
• Bucket plots
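The two correlation measures can be computed without any special software. The sketch below uses invented data and plain Python; ties receive average ranks, which is the usual convention for the Spearman coefficient. The second risk factor is deliberately a non-linear (squared) version of the first in order to show how a gap between the two measures points to scope for transformation.

```python
import math

def pearson(x, y):
    """Linear (Pearson) correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Rank-order (Spearman) correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: five observations of a risk factor and the external PD.
external_pd = [0.001, 0.002, 0.004, 0.008, 0.016]
log_external_pd = [math.log(p) for p in external_pd]
factor = [1.0, 2.0, 3.0, 4.0, 5.0]
factor_squared = [v * v for v in factor]

for name, x in (("factor", factor), ("factor_squared", factor_squared)):
    print(name, pearson(x, log_external_pd), spearman(x, log_external_pd))
```

Both factors order the observations identically, so their rank-order correlations with the log external PD coincide, while the linear correlation of the squared factor is lower. Such a gap suggests that a monotonic transformation (here the square root) would improve linearity.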


While the correlation measures are straightforward, the bucket plots require 
further comment. The underlying rationale for applying bucket plots is to visualise 
the complete functional form of the relationship between the risk factor and the 
external PD — in contrast to the correlation measures that aggregate this information 
into a single number. This is done to make sure that the risk factors indeed display 
an approximately linear relationship with external PDs as is required by the multi 
factor model. Bucket plots for continuous risk factors can for example be con- 
structed in the following way: 


• Each risk factor range is divided into n separate buckets, using the 0,
1/n, 2/n, ..., (n-1)/n, 1 quantiles of each risk factor’s distribution as interval
borders.

• For each bucket the average associated external PD is calculated. Constructing
the bucket borders from quantiles ensures that each interval
contains the same number of observations.

• The number n of intervals has to be chosen with regard to the overall number of
PD observations available for each risk factor: with increasing n it will be
possible to observe the functional form of the relationship on an ever finer
scale. However, the precision of the associated PD estimates for each bucket
will decrease and their volatility will increase.

• In order to quantify the degree of uncertainty, confidence intervals for the PD
estimates of each bucket can be calculated.

• The resulting PD estimates and confidence intervals are then plotted against the
mean risk factor value of each bucket. If a logarithmic scale is used for the PD
axis, an approximately linear relationship should result when the risk factor has
been appropriately transformed. Figure 4.1 shows an example of a bucket plot
for a continuous risk factor.


³⁰ For an overview of measures of discriminatory power see Deutsche Bundesbank (2003) or
Chap. 13.


³¹ A good discriminatory power of the internal rating system in terms of predicting external ratings
and a good discriminatory power of the external ratings in terms of predicting future defaults will
then establish a good discriminatory power of the internal rating system in terms of predicting
future defaults.


³² Linear correlations are typically termed Pearson correlations while rank-order correlations are
associated with Spearman. Linear correlations are important since they measure the degree of
linear relationship which corresponds with the linear model employed for the multi-factor analysis.
Rank-order correlations can be compared with linear correlations in order to identify potential
scope for risk factor transformation.
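The data behind a bucket plot for a continuous risk factor can be prepared as in the following sketch. It is an illustration with invented numbers: observations are sorted by the risk factor and cut into n buckets of (nearly) equal size, which corresponds to using quantiles as bucket borders, and the confidence intervals mentioned above are left out for brevity.

```python
import math

def bucket_plot_data(risk_factor, external_pd, n_buckets):
    """Return one (mean risk factor, mean PD, ln mean PD) triple per bucket."""
    pairs = sorted(zip(risk_factor, external_pd))
    m = len(pairs)
    rows = []
    for b in range(n_buckets):
        # Equal-count slices of the sorted sample act as quantile buckets.
        chunk = pairs[b * m // n_buckets:(b + 1) * m // n_buckets]
        if not chunk:
            continue
        mean_x = sum(x for x, _ in chunk) / len(chunk)
        mean_pd = sum(p for _, p in chunk) / len(chunk)
        rows.append((mean_x, mean_pd, math.log(mean_pd)))
    return rows

# Hypothetical sample: 12 observations, PD rising with the risk factor.
factor_values = list(range(1, 13))
pds = [0.01] * 4 + [0.02] * 4 + [0.04] * 4

for mean_x, mean_pd, log_pd in bucket_plot_data(factor_values, pds, 3):
    print(mean_x, mean_pd, round(log_pd, 3))
```

Plotting the third column against the first gives the bucket plot on a logarithmic PD scale. In this made-up example the three points lie on a straight line because the PD doubles from bucket to bucket while the mean risk factor rises in equal steps.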


Bucket plots for discrete risk factors can be devised according to the same
method as described above with only one difference: for discrete factors, each
realisation should represent one bucket irrespective of the number of observations
available.

Fig. 4.1 Example of a bucket plot. It illustrates the functional relationship between a risk factor
and corresponding external PDs where the latter are measured on a logarithmic scale. The
relationship on this scale should be approximately linear


4.4.3 Transformation


The following types of transformation typical for the development of rating models 
will be considered in this section: 


• Truncation

• Other non-linear transformations of continuous risk factors (e.g. taking a risk
factor’s logarithm)

• Attaching a score to discrete risk factors

• Standardisation, i.e. a linear transformation in order to achieve the same mean
and standard deviation for each risk factor


We will discuss each of these types of transformations in turn. Truncation means 
that continuous risk factors will be cut off at some point on the left and right, more 
precisely, 


$$x_{\mathrm{trunc}} = \min\{x_u, \max\{x_l, x\}\}$$


where $x_u$ is the upper and $x_l$ the lower border at which the risk factor $x$ is truncated.
Note that the truncation function described above can be smoothed by applying a 
logit-type transformation instead. Truncation is done mainly for the following reasons: 


• To reduce the impact of outliers and to concentrate the analysis on a risk factor’s
typical range³³
• To reduce a risk factor to the range on which it has discriminatory power
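In code, truncation is a one-liner. The sketch below also shows one possible smoothed variant; the particular logistic form, scaled so that it has slope one at the midpoint of the truncation range, is an assumption made for illustration, since the text only states that a logit-type transformation can be used.

```python
import math

def truncate(x, lower, upper):
    """Hard truncation: cut the risk factor off at the lower and upper border."""
    return min(upper, max(lower, x))

def smooth_truncate(x, lower, upper):
    """Smooth alternative based on a logistic curve (illustrative choice).

    The curve stays inside [lower, upper], is strictly increasing and has
    slope 1 at the midpoint, so typical values are left almost unchanged.
    The logistic function is written via tanh to avoid overflow.
    """
    width = upper - lower
    z = 4.0 * (x - (lower + upper) / 2.0) / width
    return lower + width * 0.5 * (1.0 + math.tanh(z / 2.0))

# Hypothetical risk factor truncated to the range [0, 3].
for value in (-1.0, 0.5, 1.5, 2.5, 10.0):
    print(value, truncate(value, 0.0, 3.0), round(smooth_truncate(value, 0.0, 3.0), 4))
```

The hard version leaves all values inside the borders untouched, whereas the smooth version compresses values gradually as they approach a border, which avoids the kink at the truncation points.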


Other types of non-linear transformations are typically applied to continuous risk 
factors to achieve an approximately linear relationship with the log external PD. An 
overview of methods to achieve linearity can be found in Chap. 2. These methods 
will therefore not be discussed here. 

In contrast to continuous risk factors, discrete factors (such as qualitative 
information about the obligor, e.g. its quality of management or competitive 
position) do not have an a priori metric interpretation. Therefore, a score has to 
be attached to each of the discrete risk factor’s potential realisations (e.g., excellent, 
good, medium or poor quality management). As with the non-linear transformation 
for the continuous risk factors, the scores have to be chosen in such a way as to 
achieve the linear relationship of risk factors with log PDs. This can typically be 
achieved by calculating the mean external PD for each risk factor realisation and 
then applying the logarithm to arrive at the final score. 
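The score calculation for a discrete risk factor can be sketched as follows. The data and the realisation labels are invented; the function names are hypothetical. A small helper also checks whether the resulting scores are monotonic in the ordering of the realisations from best to worst.

```python
import math
from collections import defaultdict

def discrete_scores(realisations, external_pds):
    """Score of a realisation = ln(mean external PD of its observations)."""
    pd_sum = defaultdict(float)
    count = defaultdict(int)
    for realisation, pd in zip(realisations, external_pds):
        pd_sum[realisation] += pd
        count[realisation] += 1
    return {r: math.log(pd_sum[r] / count[r]) for r in pd_sum}

def is_monotonic(scores, best_to_worst):
    """True if the score never decreases from the best to the worst realisation."""
    values = [scores[r] for r in best_to_worst if r in scores]
    return all(a <= b for a, b in zip(values, values[1:]))

# Hypothetical assessments of management quality with external PDs.
quality = ["excellent", "good", "good", "medium", "poor", "poor"]
pds = [0.002, 0.004, 0.006, 0.020, 0.050, 0.150]

scores = discrete_scores(quality, pds)
print(scores)
print(is_monotonic(scores, ["excellent", "good", "medium", "poor"]))
```

Realisations with very few observations (such as "excellent" and "medium" in this toy example) yield imprecise mean PDs, so a monotonicity check of this kind would normally be read together with confidence intervals.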

However, the resulting scores will not always be monotonic in the underlying 
risk factor (i.e., the average PD may not always decrease when the assessment with 
respect to this risk factor improves). In such cases it has to be decided whether the 
effect is within statistical confidence levels or indeed indicates a problem with the
underlying risk factor. If the first holds true (typically for risk factor realisations
where only very few observations are available), interpolation techniques can be
applied, augmented by expert judgement. In the second case, depending on the
severity of the effects identified, it may be necessary (a) to analyse the reasons for
this effect, or (b) to merge different realisations of the risk factor into a single score,
or (c) to eliminate the risk factor from subsequent analysis.

³³ This is often necessary for sensible visualization of the risk factor’s distribution.

All transformations that have been described up to now have been performed in 
order to improve the risk factor’s linear correlation with log external PDs. The 
remaining transformation (standardisation) has a linear functional form and will 
therefore not alter linear correlations. It is performed in order to unify the different
risk factors’ scales and, accordingly, improve their comparability, primarily in the
following two respects:


• How good or bad is a risk factor realisation compared with the portfolio average?
• Interpretability of the coefficients resulting from the linear regression as weights
for the influence of one particular risk factor on the rating result


Typically, the risk factors are standardised to the same mean and standard
deviation. This transformation only ensures that the risk factors are comparable
with respect to the first and second moments of their distributions. Perfect comparability
will only be achieved when all moments of the standardised risk factors’
distributions are roughly the same, i.e. if they follow a similar probability
distribution. This will typically not be the case, in particular since there are risk
factors with continuous and with discrete distributions. However, some
degree of overall distributional similarity should result from the need to
establish an approximately linear relationship between each risk factor and the
log external PD. Moreover, we will comment on the rationale of and the potential
problems with the interpretation of regression estimates as weights of influence in
Sect. 4.5.4, where we deal with multi-factor analysis.
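
As an illustration, standardisation to a common mean and standard deviation might look as follows. The target values 0 and 1 and the name `standardise` are assumptions made for this sketch; the text only requires that all risk factors share the same mean and standard deviation.

```python
import statistics


def standardise(values, target_mean=0.0, target_std=1.0):
    """Linear transformation to a given mean and (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("a constant risk factor cannot be standardised")
    return [target_mean + target_std * (v - mean) / std for v in values]


# Hypothetical (already truncated) risk factor
leverage = [0.1, 0.2, 0.4, 0.7]
leverage_std = standardise(leverage)
```

Because the transformation is linear, the ordering of the observations and their linear correlation with log PDs are unchanged.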


4.4.4 Representativeness 


Representativeness, while important for other types of rating systems, should be 
treated with particular care when developing SRA rating systems.³⁴ The following
two types of comparisons are of specific interest: 


• Comparison of the internal sample types IE (including only externally rated
obligors) and IA (comprising all internal obligors) with each other. This
comparison is necessary since SRA rating systems are developed on samples
that include only externally rated obligors but are also applied to obligors without
external ratings.

• Comparison of the external data set (E) with the internal data set IA. This
comparison arises from the need to increase the available number of observations
for rating development by including external data.

³⁴ An SRA rating system will always face the problem that, due to the relative rareness of default
data, it is difficult to validate it for obligors that are not externally rated. While some validation
techniques are available (see Sect. 4.5.8), showing that the data for externally rated obligors is
comparable with that of non-externally rated obligors will be one of the major steps to make sure
that the derived rating system will not only perform well for the former but also for the latter.


Representativeness can be analysed by comparing the distribution of the risk 
factors and some other key factors (such as countries/regions, industry sectors, 
company type, obligor size, etc.) on each sample. In this context frequency plots 
(for continuous factors, see Fig. 4.2) and tables ordered by the frequency of each 
realisation (for discrete factors) can be particularly useful. 

These tools can be supplemented with basic descriptive statistics (e.g. difference 
of the medians of both samples relative to their standard deviation or the ratio of the 
standard deviations on both samples). Formal statistical tests on the identity of 
distributions across samples were not found to be useful since the question is not 
whether distributions are identical (typically they are not) but whether they are 
sufficiently similar for the extrapolation of results and estimates derived on one 
sample to the other sample. 
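
The two descriptive statistics mentioned above could be computed along the following lines. The text does not say which standard deviation the median difference should be scaled by; using the root mean of the two sample variances is an assumption of this sketch, as are the function and key names.

```python
import statistics


def representativeness_stats(sample_a, sample_b):
    """Scaled median difference and ratio of standard deviations of two samples."""
    std_a = statistics.stdev(sample_a)
    std_b = statistics.stdev(sample_b)
    # Assumption: scale by the root mean of both sample variances
    common_std = ((std_a ** 2 + std_b ** 2) / 2) ** 0.5
    median_diff = statistics.median(sample_a) - statistics.median(sample_b)
    return {"median_shift": median_diff / common_std, "std_ratio": std_a / std_b}


# Hypothetical risk factor on the external (E) and internal (IA) sample
external = [0.0, 1.0, 2.0, 3.0, 4.0]
internal = [1.0, 2.0, 3.0, 4.0, 5.0]
stats = representativeness_stats(external, internal)
```

Whether a given shift or ratio is still acceptable remains a judgement call, in line with the remark above that formal tests on identical distributions are of limited use.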


[Figure 4.2: two frequency plots with panels "External Data" and "Internal Data"; horizontal axis: Risk Factor]

Fig. 4.2 Example for a frequency plot that compares a risk factor’s distribution on the external
data set with its distribution on the internal data set

What can be done when data is found to be unrepresentative?


• First, it has to be ascertained whether the problem occurs only for a few risk
factors/key figures or for the majority.

• In the first case, the reasons for the differences have to be analysed and the
development samples adjusted accordingly. One reason, for example, might be
that the distribution of obligors across regions or industry sectors is extremely
different. The development sample can then be adjusted by reducing the number
of obligors in those regions/industry sectors that are over-represented in the
development sample.

• In the second case, a variety of approaches can be considered, depending on the
specific situation. Examples include:

– The range of the risk factors can be reduced so that it only includes areas that
are observable on both the development and the target sample.

– The weight of a risk factor found to be insufficiently representative can be
reduced manually or it can be excluded from the analysis.


4.4.5 Missing Values 


A missing value analysis typically includes the following steps: 


• Decision as to whether a risk factor will be classified as missing for a particular
observation

• Calculation of fillers for missing values/exclusion of observations with missing
values


While for some risk factors such as qualitative assessments (e.g., management 
quality), the first issue can be decided immediately, it is not always that clear-cut for 
quantitative risk factors such as balance sheet ratios that may be calculated from a 
number of different single positions. Typical examples are balance sheet ratios that 
include a company’s cash flow that in turn is the sum of various single balance sheet 
items. 

The problem — typically arising on the external data set — is that for a large 
proportion of observations at least one of these items will be missing. Hence, in a 
first step the relative sizes of the balance sheet items have to be compared with each 
other and based on this comparison, rules must be devised as to which combination 
of missing values will trigger the overall position to be classified as missing: if 
components with a large absolute size are missing, the risk factors should be set to 
missing; if not, the aggregate position can be calculated by either omitting the 
missing items or using fillers which, however, should be chosen conditional on the 
size of the largest components. 

We now come back to the second issue raised at the beginning of this section, i.e., 
the calculation of fillers for missing values on the risk factor level. It is, of course, 
related to the issue of calculating fillers on the component level. However, the need to
employ conditional estimates is not so severe. Typically, there will be quite a lot of
risk factors that are correlated with each other. Hence, making estimates for missing 
values of one risk factor conditional on other risk factors should produce more 
accurate fillers. However, it will also be time consuming. Therefore, in practice, 
only some very simple bits of information will typically be used for conditioning, 
e.g., the portfolio to which an obligor belongs (external or internal data set). 

Moreover, different quantiles of the distribution might be employed for the 
calculation of fillers on the external and internal data set respectively. For the 
external sample, a missing value may not constitute a significant negative signal 
in itself. For the internal sample, on the other hand, missing values usually are 
negative signals, since a company could be expected to provide to the bank the 
information it needs to complete its internal rating assessment. Therefore, missing 
values on the internal sample will typically be substituted by more conservative 
quantiles than missing values on the external data set. 
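
A simple quantile-based filler, conditional only on the sample an obligor belongs to, might be sketched as follows. The quantile levels 0.50 and 0.75, the assumption that higher values of the risk factor mean higher risk, and all names are hypothetical.

```python
def quantile(values, q):
    """Empirical quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (pos - low) * (ordered[high] - ordered[low])


def fill_missing(values, q):
    """Replace None entries by the q-quantile of the observed values."""
    observed = [v for v in values if v is not None]
    filler = quantile(observed, q)
    return [filler if v is None else v for v in values]


# Hypothetical risk factor where higher values indicate higher risk
external = [0.10, None, 0.30, 0.20, 0.40]
internal = [0.10, None, 0.30, 0.20, 0.40]
external_filled = fill_missing(external, q=0.50)  # neutral filler
internal_filled = fill_missing(internal, q=0.75)  # more conservative filler
```

For a risk factor where lower values indicate higher risk, the conservative quantile would have to be chosen from the lower end of the distribution instead.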

Finally, depending on the relative frequency of missing values in the sample, it 
might be necessary to exclude some observations with missing values to avoid 
biases in statistical estimates. 


4.4.6 Summary 


Concluding this section we want to summarise the techniques that we have pre- 
sented for univariate risk factor analysis and map them to the samples on which they 
should be performed. Since we have already dealt with the sample issue in Sect. 4.3, 
here we will focus on those two sample dimensions that we think are most 
important for univariate factor analysis, i.e. externally rated obligors versus all 
obligors and external versus internal data set. As in Sect. 4.4.4 we use the following 
shortcuts for these sample types: 


• E: External data set, only externally rated obligors
• IE: Internal data set, only externally rated obligors
• IA: Internal data set, all obligors


The univariate analysis techniques and corresponding sample types are sum- 
marised in Table 4.4. 


Table 4.4 Univariate analysis techniques and corresponding sample types

Type of univariate analysis   Sample type   Description
Factor transformation         IE, IA*       (a) Truncation
                                            (b) Other non-linear transformations of continuous risk factors
                                                (e.g., taking a risk factor’s logarithm)
                                            (c) Calculating scores for discrete risk factors
                                            (d) Standardisation: linear transformation in order to achieve
                                                the same median (mean) and standard deviation for all risk
                                                factors
Discriminatory power          E, IE         (a) Correlation (rank order and linear) with external PD
                                            (b) Bucket plots
Representativeness            IE, IA        (a) Comparison of internal samples with each other (IE and IA)
                              E, IA         (b) Comparison of external sample (E) with internal sample (IA)
Missing values                E, IA         Fillers for missing values in the external and internal samples
                                            respectively

* IE is only needed to derive the scores for the qualitative risk factors. All other types of analysis are
performed on IA.


4.5 Multi-factor Model and Validation


4.5.1 Introduction


Once the univariate analysis described in Sect. 4.4 has been completed, the multi-factor
model has to be estimated and the estimation results communicated, adjusted
(if necessary), and validated. These issues will be dealt with in Sect. 4.5 in this order:


• Model selection: which type of model is chosen and which risk factors will enter
the model?

• Model assumptions: Statistical models typically come with quite a few modelling
assumptions that guarantee that estimation results are efficient and valid.
Therefore, it has to be analysed whether the most important assumptions of the
selected model are valid for the data and if not, how any violations of modelling
assumptions can be dealt with.

• Measuring the influence of risk factors: We will discuss how the relative influence
of single risk factors on the rating result can be expressed in terms of weights to
facilitate the interpretation of the estimated model. In a second step, we comment
on the problems associated with the calculation and interpretation of these weights.

• Manual adjustments and calibration: We discuss the rationale and the most
important issues that must be dealt with when model estimates are adjusted
manually and describe how the resulting model can be calibrated.

• Two-step regression: It is briefly noted that with external data the regression
model will typically have to be estimated in two steps.

• Corporate groups and government support: We propose a simple method to
produce an empirical estimate for the optimal absolute influence of supporters on
an obligor’s rating.

• Validation: We briefly itemise the validation measures that we found most
useful for a short-cut validation in the context of rating development.


4.5.2 Model Selection 


The issue of model selection primarily has two dimensions. First, the model type
has to be chosen and then it has to be decided which risk factors will be included in
the model. Regarding the first question, the simplest and most frequently used
model in multi-factor analysis is linear regression. A typical linear regression
model for SRA-type rating systems will have the following form³⁵:


$$\mathrm{Log}(PD_i) = b_0 + b_1 x_{i1} + \dots + b_m x_{im} + \varepsilon_i \quad (i = 1, \dots, n), \tag{4.3}$$


where $PD_i$ denotes the external PD, $x_{ij}$ the value of risk factor $j$, $\varepsilon_i$ the regression
model’s error term for observation $i$, and $b_0, \dots, b_m$ are the regression coefficients
that must be estimated from the data. Note that each observation $i$ describes a
specific firm over a specific time span.

Log PDs, rather than PDs, are regressed on the risk factors because, on the one hand,
this scale is typically most compatible with the linear relationship assumed by the
regression model and, on the other hand, internal master scales that translate PDs into
rating grades are often logarithmic in PDs.

We now turn to the second issue in this section, the selection of those risk factors 
that will constitute the final regression model employed for the rating system. The 
following types of analysis are useful for risk factor selection: 


• Univariate discriminatory power (on internal and external data set)
• Representativeness
• Correlations/multicollinearity between risk factors
• Formal model selection tools


We have already dealt with the issues of discriminatory power and representa- 
tiveness in Sect. 4.4. For correlations between risk factors and multicollinearity we 
refer the reader to Chap. 2. In this section we will add some comments on typical 
formal model selection tools in the context of linear regression: 


• Formal model selection tools are no substitute for a careful single factor and
correlation analysis.

• There is quite a variety of formal model selection methods.³⁶ We found the R²
maximisation method, which finds the model with the best R² for each given
number of risk factors, particularly useful for the following reasons:

– It allows the reduction in multicollinearity to be traded off against the associated
loss in the model’s R² on the development sample.

– The R² measure is consistent with the linear correlation measure employed in
the single factor analysis.³⁷
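
The R² maximisation idea can be illustrated by a brute-force search over all subsets of risk factors. The sketch below is not the procedure of any particular statistics package: it fits each candidate model by ordinary least squares through the normal equations, and all names and data are hypothetical. It is only practical for a small number of candidate factors.

```python
from itertools import combinations


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]  # fails for perfectly collinear factors
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on an intercept plus the given factor columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def best_subsets(factors, y):
    """For each model size k, return (factor names, R^2) of the best subset."""
    names = sorted(factors)
    best = {}
    for k in range(1, len(names) + 1):
        candidates = (
            (combo, r_squared([factors[name] for name in combo], y))
            for combo in combinations(names, k)
        )
        best[k] = max(candidates, key=lambda item: item[1])
    return best


# Hypothetical standardised risk factors; log PD depends on x1 only
factors = {
    "x1": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [1.0, 0.0, 1.0, 0.0, 1.0, 1.0],
    "x3": [2.0, 1.0, 0.0, 1.0, 2.0, 0.0],
}
log_pd = [-6.0 + 0.5 * v for v in factors["x1"]]
best = best_subsets(factors, log_pd)
```

Comparing `best[k]` across model sizes shows how much R² is lost when factors are dropped, which is the trade-off described in the first reason above.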


³⁵ Throughout this article Log denotes the natural logarithm with base e.
³⁶ For reviews on formal model-selection methods see Hocking (1976) or Judge et al. (1980).
³⁷ R² is the square of the linear correlation between the dependent variable (the log external PD)
and the model prediction for this variable.


4.5.3 Model Assumptions


Three crucial stochastic assumptions about the error terms $\varepsilon$ constitute the basis of
linear regression models³⁸:


• Normal distribution (of error terms)
• Independence (of all error terms from each other)
• Homoscedasticity (all error terms have the same standard deviation)


For all three issues there are a variety of statistical tests (e.g., Greene 2003). 
If these tests reject the above hypotheses, it is up to the modeller to decide on the 
severity of these effects, i.e., whether they can be accepted from a practical point of 
view or not. 

As for normality, looking at distribution plots of the residuals,³⁹ we found that
they often came very close to a normal distribution even in cases where statistical
tests rejected this hypothesis. Moreover, even under violation of the normality
assumption, estimators are still efficient (or, more precisely, BLUE).⁴⁰ Only the
related statistical tests and confidence intervals are no longer exactly valid, and even
these remain approximately valid for large sample sizes.

Violations of the two other assumptions (independence and homoscedasticity) 
tend to be more severe. They can be summarised as deviations from the regression 
model’s error term covariance matrix which is assumed to have identical values for 
each entry of the diagonal (homoscedasticity) and zeros for each entry that is not on 
the diagonal (independence). 

If statistical tests reject the hypotheses of independence/homoscedasticity, this
problem can be dealt with when (a) plausible assumptions about the structure of the
covariance matrix can be made and (b) this structure can be described with a
sufficiently small set of parameters. If this is the case, these parameters and hence the
covariance matrix can be estimated from the data (or, more precisely, from the
residuals). The least squares method employed for parameter estimation in the regression
model can then be adjusted in such a way that the original desirable properties of
the ordinary least squares (OLS) estimators are restored. In the literature (e.g.,
Greene 2003) this method is referred to as generalised least squares (GLS).

In order to proceed, hypotheses on the structure of the covariance matrix have to
be derived. In Sect. 4.3, dealing with sample construction, we have already
described one possible source each of heteroscedasticity⁴¹ and of correlation in the data.


³⁸ For a comprehensive overview on applied linear regression see Greene (2003).

³⁹ Residuals (e) are the typical estimators for the (unobservable) theoretical error terms ($\varepsilon$). They
are defined as the difference between the dependent variable and the model predictions of this
variable.

⁴⁰ BLUE stands for best linear unbiased estimator.

⁴¹ The term heteroscedasticity refers to cases where standard deviations of error terms are different
as opposed to the assumption of identical standard deviations (homoscedasticity).

We argued that the size (i.e., the standard deviation) of the error term might
sensibly be assumed to be proportional to the length of the time interval to which
the observation is attached. Hence, we proposed to weight each observation with
the length of the corresponding time interval. In the context of regression analysis,
weighting observations amounts exactly to assuming a specific type of heteroscedastic
covariance matrix and applying the corresponding GLS estimation.
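
To make the link between weighting and GLS concrete, the following sketch fits a weighted least squares line with a single regressor. It is a simplified illustration: the single-factor setting, the function name and the data are assumptions. In weighted least squares, a weight $w_i$ corresponds to assuming an error variance proportional to $1/w_i$ for observation $i$.

```python
def weighted_least_squares(x, y, weights):
    """Fit y = a + b*x by minimising sum_i w_i * (y_i - a - b*x_i)^2."""
    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: one standardised risk factor, log external PDs and the
# length (in years) of the time interval attached to each observation
factor = [-1.0, 0.0, 1.0, 2.0]
log_pd = [-5.8, -5.0, -4.2, -3.4]
interval_length = [0.25, 0.5, 1.0, 1.0]
intercept, slope = weighted_least_squares(factor, log_pd, interval_length)
```

With equal weights the function reduces to ordinary least squares.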

We also concluded that autocorrelation in the time series part of the data might 
well increase when time intervals become smaller and smaller. One of the simplest 
and most commonly employed structures for correlated error terms assumes an
AR(1) correlation structure between subsequent error terms:


$$\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t \quad (t = 1, \dots, T), \tag{4.4}$$


where the variables $u_t$ are independent of each other. Hence, the issue could be dealt
with by estimating the parameter $\rho$ from the data, deriving the correlation matrix
and applying GLS.⁴² There is, however, one crucial problem with this procedure: it
is not logical to assume this correlation structure for the complete data set as would 
be done in a standard time series regression setting. Rather, the rating development 
data set at hand will typically have a panel data structure where the correlation 
structure of the cross section’s error terms (different obligors) will most likely be 
different from the correlation structure of the time series part (different points in 
time for the same obligor). Applying a panel data model with an AR(1) structure in 
the time series part could be a sensible first approximation. Corresponding error 
term models offered by statistics software packages are often of the type 


$$\varepsilon_{it} = \rho_i\,\varepsilon_{i,t-1} + u_{it} \quad (t = 1, \dots, T;\ i = 1, \dots, n). \tag{4.5}$$


Note that the AR parameter $\rho$ is estimated separately for each cross section (i.e.
firm): $\rho = \rho_i$. Therefore, quite a few time series observations are required for each
single obligor to make confident estimates, which often will not be feasible for
rating development data. A more practicable model would estimate an average AR
parameter $\rho$ for all obligors:


$$\varepsilon_{it} = \rho\,\varepsilon_{i,t-1} + u_{it} \quad (t = 1, \dots, T;\ i = 1, \dots, n). \tag{4.6}$$
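
One simple way to obtain a common AR parameter as in (4.6) is to regress each residual on its predecessor, pooling the pairs of all obligors while never forming a lag across two different obligors. The least-squares estimator below is one possible choice, not necessarily the one used by a given software package; names and data are hypothetical.

```python
def pooled_ar1(residuals_by_obligor):
    """Pooled least-squares estimate of rho in e_it = rho * e_i,t-1 + u_it."""
    numerator = 0.0
    denominator = 0.0
    for series in residuals_by_obligor.values():
        # Lagged pairs are formed within one obligor's time series only
        for previous, current in zip(series, series[1:]):
            numerator += previous * current
            denominator += previous * previous
    if denominator == 0.0:
        raise ValueError("need at least one non-zero lagged residual")
    return numerator / denominator


# Hypothetical regression residuals, ordered in time for each obligor
residuals = {
    "obligor_A": [1.0, 0.5, 0.25],
    "obligor_B": [-2.0, -1.0, -0.5],
    "obligor_C": [3.0],  # a single observation contributes no lagged pair
}
rho = pooled_ar1(residuals)
```

The estimate can then be used to build the correlation matrix for the GLS step, with zero correlation assumed between different obligors.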


There might be other sources of correlation or heteroscedasticity in the data
requiring a different structure for the covariance matrix than the one described
above. If no specific reasons can be thought of from a theoretical point of view, one
will usually look at residual plots to identify some patterns. Typically, residuals will
be plotted (a) against the dependent variable (log PD in our case), (b) against
those independent variables (risk factors) with the highest weights or (c) against some
other structural variable, such as the length of the time interval associated with each
observation. If effects can be identified, first a parametric model has to be devised
and then the associated parameters can be estimated from the residuals. That will
give a rough picture of the severity of the effects and can hence provide the basis for
the decision as to whether to assess the deviations from the model assumptions as
acceptable or whether to incorporate these effects into the model, either by
weighting observations (in the case of heteroscedasticity) or by devising a specific
correlation model (in the case of deviations from independence).

⁴² Indeed, a standard procedure for dealing with autocorrelated error terms in the way described
above is implemented in most statistical software packages.


4.5.4 Measuring Influence 


Once a specific regression model has been chosen and estimated, one of the most 
important aspects of the model for practitioners will be each risk factor’s influence 
on an obligor’s rating. Hence, a measure of influence has to be chosen that can also 
be used for potential manual adjustments of the derived model. 

To our knowledge, the most widely applied method is to adjust for the typically 
different scales on which the risk factors are measured by multiplying the estimator 
for the risk factor’s coefficient in the regression model by the risk factor’s standard 
deviation and then deriving weights by mapping these adjusted coefficients to the 
interval [0,1] so that the absolute values of all coefficients add up to 1.⁴³

What is the interpretation of this approach to the calculation of weights?
It defines the weight of a risk factor $x_j$ by the degree to which the log PD predicted
by the regression model will fluctuate when all other risk factors $(x_k)_{k \neq j}$ are kept
constant: the more log PD fluctuates, the higher the risk factor’s influence. As a
measure for the degree of fluctuation, the predictor’s standard deviation is used.
Hence, the weight $w_j$ of a risk factor $x_j$ with coefficient $b_j$ can be calculated as


$$w_j = \frac{w_j^{*}}{w_1^{*} + \dots + w_m^{*}}, \tag{4.7}$$


where


$$w_j^{*} = \mathrm{STD}\{\mathrm{Log}(PD) \mid (x_k)_{k \neq j}\} = \mathrm{STD}(b_j x_j) = |b_j|\,\mathrm{STD}(x_j), \tag{4.8}$$


and STD denotes the standard deviation operator.
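
The weight calculation in (4.7) and (4.8) can be sketched as follows. The coefficient values, factor names and the use of the population standard deviation are assumptions of this example.

```python
import statistics


def influence_weights(coefficients, factor_values):
    """Weights w_j = |b_j| * STD(x_j), normalised so that they add up to 1."""
    raw = {
        name: abs(b) * statistics.pstdev(factor_values[name])
        for name, b in coefficients.items()
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical regression coefficients and risk factor values
coefficients = {"leverage": 0.6, "profitability": -0.3}
factor_values = {
    "leverage": [0.0, 2.0],       # standard deviation 1
    "profitability": [0.0, 4.0],  # standard deviation 2
}
weights = influence_weights(coefficients, factor_values)
```

In this toy example both factors receive the same weight, because the smaller coefficient of the second factor is offset by its larger standard deviation.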


⁴³ Note that this method is also suggested by standard regression outputs. The associated estimates
are typically termed “standardized coefficients”. Moreover, if the risk factors have already been
standardized to a common standard deviation, as described in Sect. 4.4, they already have the
same scale and coefficients only have to be mapped to [0,1] in order to add up to 1.


68 U. Erlenmaier 


However, when using this type of influence measure, the following aspects have 
to be taken into account: 


• The standard deviation should be calculated on the internal data set containing
all obligors, not only the externally rated obligors.

• The master rating scale will typically be logarithmic in PDs. Therefore, measuring
the risk factor’s influence on predicted log PDs is approximately equivalent
to measuring its influence on the obligor’s rating. This should usually be what
practitioners are interested in. However, if the influence on an obligor’s predicted
PD is to be measured, the above logic will not apply anymore since
predicted PDs are an exponential function of the risk factor and hence their
standard deviation cannot be factored in the same fashion as described above.
Moreover, the standard deviation of the external PD will depend on the realisations
of the other risk factors $(x_k)_{k \neq j}$ that are kept constant.

• The problems described in the previous point also arise for the log-PD influence
when risk factors are transformed in a non-linear fashion, e.g. when a risk
factor’s logarithm is taken. In this case, the above interpretation of influence
can only be applied to the transformed risk factors which usually have no
sensible economic interpretation.

• Also, the above mentioned interpretation does not take into account the risk
factors’ correlation structure. The correlation between risk factors is usually not
negligible. In this case the conditional distribution (in particular, the conditional
standard deviation) of the log-PD predictor, given that the other risk factors are
constant, will depend on the particular values at which the other risk factors are
kept constant.

• Making the risk factors’ distributions comparable only by adjusting for their
standard deviation might be a crude measure if their distributional forms differ a
lot (e.g., continuous versus discrete risk factors).⁴⁴

• The weights described above measure a risk factor’s average influence over the
sample. While this may be suitable in the model development stage when
deciding, e.g., about whether the resulting weights are appropriate, it may not
be appropriate for practitioners interested in the influence that the risk factors
have for a specific obligor. Other tools can be applied here, e.g., plotting how a
change in one risk factor over a specified range will affect an obligor’s rating.


Despite the theoretical problems cited above, standard-deviation-based measures
of influence have proved to work quite well in practice. However, there appears to
be some scope for further research on alternative measures of influence. Moreover,
it should be noted that, when correlations between risk factors are non-negligible, a
risk factor’s correlation with predicted log PDs can be quite high, even if the weight
as defined above is not. We therefore found it important for the interpretation of the
derived regression model to evaluate these correlations for all risk factors and
report them together with the weights.

⁴⁴ Additionally, the standard deviation tends to be a very unstable statistical measure that can be
very sensitive to changes in the risk factor’s distribution. However, this problem should be reduced
significantly by the truncation of the risk factors, which reduces the influence of outliers.


4.5.5 Manual Adjustments and Calibration 


There may be quite a variety of rationales for manually adjusting the estimation 
results derived from the statistical model, for instance, expert judgements that deviate 
significantly from those estimations, insufficient empirical basis for specific portfolio 
segments, insufficient representativeness of the development sample, or excessively 
high weights of qualitative as opposed to quantitative risk factors.⁴⁵ When manual
adjustments are made, the following subsequent analyses are important: 


1. Ensuring that the rating system’s discriminatory power is not reduced too much
2. Re-establishing the calibration that statistical models provide automatically in 
the SRA context 


Regarding the first issue, the standard validation measures, as briefly described
in Sect. 4.5.8, will be applied. The second issue can be addressed by regressing log PDs
on the score resulting from the manually adjusted weights $\omega_1, \dots, \omega_m$:


$$\mathrm{Log}(PD_i) = c_0 + c_1[\omega_1 x_{i1} + \dots + \omega_m x_{im}] + \varepsilon_i \quad (i = 1, \dots, n). \tag{4.9}$$


Note that $c_0$ and $c_1$ are the coefficients that must be estimated in this second
regression. The parameter $c_0$ is related to the average PD in the portfolio while $c_1$
controls the rating system’s implicit discriminatory power, i.e., the degree to which
predicted PDs vary across the obligors in the portfolio.⁴⁶

The estimates for $c_0$ and $c_1$ will give additional evidence for the degree to which
the manual adjustments have changed the rating system’s overall properties: if
changes are not too big, then $c_0$ should not differ much from $b_0$ and $c_1$ should be
close to $b_\Sigma = [|b_1| + \dots + |b_m|]$ if all risk factors have been standardised to the
same standard deviation.⁴⁷

Finally, for each observation $i$, a PD estimate can be derived from the above
regression results by the following formulas:


$$E[PD_i \mid X_i] = \exp(\mu_i + \sigma_i^2/2) \quad (i = 1, \dots, n), \text{ where} \tag{4.10a}$$

$$\mu_i = E[\log(PD_i) \mid X_i] = c_0 + c_1[\omega_1 x_{i1} + \dots + \omega_m x_{im}] \text{ and} \tag{4.10b}$$

$$\sigma_i^2 = \mathrm{Var}(\varepsilon_i). \tag{4.10c}$$


⁴⁵ With the SRA approach to rating development, there is the problem that the loan manager may
use qualitative risk factors in order to make internal and external ratings match. If that is the case,
the relative weight of qualitative factors as estimated by the statistical model will typically be too
high compared to the weights of quantitative risk factors. The validation measures that are not
linked to external ratings (see Sect. 4.5.8) and also expert judgement may then help to readjust
those weights appropriately.

⁴⁶ More formally, the implicit discriminatory power is defined as the expected value of the
(explicit) discriminatory power, as measured by the Gini coefficient (cf. Chap. 13).

⁴⁷ This can be derived from (4.7) and (4.8).


Note that $X_i$ denotes the vector of all risk factor realisations for observation $i$ and
E[.] is the expectation operator. The result is derived from the formula for the mean
of log-normally distributed random variables.⁴⁸ For the formula to be valid, the
error terms $\varepsilon_i$ have to be approximately normally distributed, which we found
typically to be the case (see Sect. 4.5.3). Moreover, the most straightforward way
to estimate $\sigma_i$ from the residuals would be to assume homoscedasticity, i.e. $\sigma_i = \sigma$
($i = 1, \dots, n$). If homoscedasticity cannot be achieved, the estimates for $\sigma_i$ will have
to be conditional on the structural variables that describe the sources of heteroscedasticity.
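
The calibration regression (4.9) and the PD formula (4.10) can be sketched as follows, assuming homoscedastic errors. The scores stand for the already aggregated term $\omega_1 x_{i1} + \dots + \omega_m x_{im}$; the function names, the data and the use of $n - 2$ degrees of freedom for the variance estimate are assumptions of this sketch.

```python
import math


def calibrate(scores, log_pds):
    """OLS fit of Log(PD_i) = c0 + c1 * score_i + e_i.

    Returns (c0, c1, sigma2), where sigma2 is the residual variance under
    the homoscedasticity assumption.
    """
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(log_pds) / n
    sxx = sum((s - mean_s) ** 2 for s in scores)
    sxy = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, log_pds))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_s
    residuals = [y - (c0 + c1 * s) for s, y in zip(scores, log_pds)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)
    return c0, c1, sigma2


def predicted_pd(score, c0, c1, sigma2):
    """E[PD | X] = exp(mu + sigma^2 / 2) with mu = c0 + c1 * score."""
    return math.exp(c0 + c1 * score + sigma2 / 2.0)


# Hypothetical manually weighted scores and log external PDs
scores = [-1.0, 0.0, 1.0, 2.0, 3.0]
log_pds = [-6.1, -4.9, -4.2, -2.9, -2.0]
c0, c1, sigma2 = calibrate(scores, log_pds)
pd_estimates = [predicted_pd(s, c0, c1, sigma2) for s in scores]
```

Note that the term $\sigma^2/2$ raises every PD estimate above $\exp(\mu_i)$; omitting it would systematically understate the expected PD when the error terms are normally distributed.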


4.5.6 Two-step Regression 


In this section we note that — when external data are employed — it will typically be 
necessary to estimate two models and, therefore, go through the process described 
in the previous sections twice. If, for example, only balance sheet ratios and 
macroeconomic risk factors are available for the external data set, then a first 
quantitative model will have to be estimated on the external data set. As a result, 
a quantitative score and corresponding PD can be calculated from this model that in 
turn can be used as an input factor for the final model. The final model will then 
include the quantitative score as one aggregated independent variable and the 
qualitative risk factors (not available for the external data set) as the other indepen- 
dent variables. 


4.5.7 Corporate Groups and Sovereign Support 


When rating a company, it is very important to take into account the corporate 
group to which the company belongs or probably some kind of government support 
(be it on the federal, state or local government level). This is typically done by 
rating both the obligor on a standalone basis (=standalone rating) and the entity that 


48 If X is normally distributed with mean $\mu$ and standard deviation $\sigma$, then
E[exp(X)] = exp($\mu + \sigma^2/2$), where E is the expectation operator (Limpert et al. 2001).
is supposed to influence the obligor’s rating (=supporter rating).⁴⁹ The obligor’s
rating is then usually derived by some type of weighted average of the associated 
PDs. The weight will depend on the degree of influence as assessed by the loan 
manager according to the rating system’s guidelines. 

Due to the huge variety and often idiosyncratic nature of corporate group or
sovereign support cases, it will be very difficult to statistically derive the correct
individual weight of each supporter. The average weight, however, could well be
validated by estimates from the data. More precisely, consider that for the development
sample we have i = 1,...,n obligors with PDs $PD_i$, corresponding supporters
with PDs $PD_i^S$ and associated supporter weights $w_i > 0$ as derived by the
rating analyst’s assessment.⁵⁰ Then, a regression model with $[(1 - w_i) \cdot PD_i]$ and
$[w_i \cdot PD_i^S]$ as independent variables and $PD_i^{ex}$ (the obligor’s external PD) as dependent
variable can be estimated to determine whether the average size of the
supporter weights $w_i$ is appropriate or whether it should be increased or decreased.
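
A minimal sketch of such a regression is given below. It rests on assumptions that are not part of the text: the regression is run without an intercept, the data are synthetic, and `fit_two_regressors` is a hypothetical helper. One possible reading of the result, also an assumption, is that a coefficient on the supporter term clearly above the coefficient on the standalone term suggests that the supporter weights are on average too low.

```python
def fit_two_regressors(x1, x2, y):
    """OLS without intercept for y = b1*x1 + b2*x2 (2x2 normal equations, Cramer's rule)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * t for a, t in zip(x1, y))
    s2y = sum(b * t for b, t in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Synthetic development sample (illustration only).
pd_standalone = [0.020, 0.050, 0.010, 0.080, 0.030]
pd_supporter = [0.002, 0.010, 0.005, 0.004, 0.001]
weights = [0.3, 0.5, 0.2, 0.6, 0.4]          # analyst-assigned supporter weights w_i

x1 = [(1.0 - w) * p for w, p in zip(weights, pd_standalone)]   # (1 - w_i) * PD_i
x2 = [w * s for w, s in zip(weights, pd_supporter)]            # w_i * PD_i^S
# External PDs constructed so that the supporter term carries more weight than assigned.
pd_external = [1.0 * a + 1.4 * b for a, b in zip(x1, x2)]

b_standalone, b_supporter = fit_two_regressors(x1, x2, pd_external)
print(f"standalone term: {b_standalone:.3f}, supporter term: {b_supporter:.3f}")
```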


4.5.8 Validation 


The validation of rating systems is discussed at length in Chaps. 12—15. Specific 
validation techniques that are valuable in a low default context (of which SRA 
portfolios are a typical example) are discussed in BCBS (2005) and in Chap. 5. 
During rating development it will typically not be possible to run through a fully- 
fledged validation process. Rather, it will be necessary to concentrate on the most 
important measures. We will therefore briefly itemise those issues that we found 
important for a short-cut validation of SRA rating systems in the context of rating 
development: 


• Validation on external ratings/external PDs
  – Correlations of internal and external PDs (for all modules of the rating
    system⁵¹)
  – Case-wise analysis of those companies with the largest differences between
    internal and external ratings
  – Comparison of average external and internal PDs across the entire portfolio
    and across sub-portfolios (such as regions, rating grades, etc.)
• Validation on default indicators
  – Gini coefficient (for all modules of the rating system)


49 Note that for the sake of simplicity, the expression “supporter” is used for all entities that
influence an obligor’s rating, be it in a positive or negative way.

50 The standalone and supporter PDs have of course been derived from the regression model of the
previous sections, possibly after manual adjustments.

51 The typical modules of an SRA rating system (statistical model, expert-guided adjustments,
corporate-group influence/government support, override) have been discussed in Sect. 4.1.




  – Comparison of default rates and corresponding confidence intervals with
    average internal PDs. This is done separately for all rating grades and also
    across all rating grades
  – Formal statistical tests of the rating system’s calibration (e.g. the
    Spiegelhalter test, see Chap. 15)
• Comparison of the new rating system with its predecessor (if available)
  – Comparison of both rating systems’ validation results on external ratings and
    the default indicator
  – Case-wise analysis of those companies with the largest differences between
    old and new rating system
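
For the Gini coefficient mentioned in the list above, a minimal sketch is shown here. It assumes that the Gini coefficient is understood as the accuracy ratio, i.e. 2·AUC − 1, and computes the AUC by comparing all defaulter/non-defaulter pairs. The function name and the data are hypothetical.

```python
def gini_coefficient(pds, defaults):
    """Gini coefficient (accuracy ratio) = 2 * AUC - 1, AUC from pairwise comparison."""
    bad = [p for p, d in zip(pds, defaults) if d == 1]
    good = [p for p, d in zip(pds, defaults) if d == 0]
    # A pair is ranked correctly if the defaulter has the higher PD; ties count one half.
    concordant = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    auc = concordant / (len(bad) * len(good))
    return 2.0 * auc - 1.0

# Hypothetical internal PDs and default indicators (1 = default).
pds = [0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.10, 0.20]
defaults = [0, 0, 0, 0, 1, 0, 1, 1]
print(f"Gini coefficient: {gini_coefficient(pds, defaults):.3f}")
```

The pairwise loop is quadratic in the sample size, which is unproblematic for the small default counts typical of SRA portfolios.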


There are also some validation techniques not yet discussed that could
enter a short-cut validation process in the rating development context, in particular
techniques addressing the relative rareness of default data in SRA portfolios (see BCBS 2005):


• Using the lowest non-default rating grades as default proxies
• Comparison of SRA obligors with the obligors from other rating segments that
  have the same rating
• Estimation of internal PDs with the duration-based approach, i.e. including
  information on rating migration into the internal PD estimation process
• Data pooling


4.6 Conclusions 


In this article we have reported on some aspects of the development of shadow 
rating (SRA) systems found to be important for practitioners. The article focused on 
the statistical model that typically forms the basis of such rating systems. In this 
section we want to summarise the major issues that we have dealt with: 


• We have stressed the importance, both in terms of the quality of the resulting
  rating system and in terms of initial development costs, of
  – The deployment of sophisticated software tools that automate the development
    process as much as possible and
  – The careful preparation and validation of the data that are employed.
• External PDs form the basis of SRA type models. We have outlined some major
  issues that we found to be important in this context:
  – Which external rating types/agencies should be used?
  – Comparison between bank internal and external default definitions and
    consequences for resulting PD estimates
  – Sample construction for the estimation of external PDs (which time period,
    which obligor types?)
  – PD estimation techniques (cohort method vs. duration-based approach)
  – Point-in-time adjustment of external through-the-cycle ratings
• In Sect. 4.3 we pointed out that different samples will be needed for different
  types of analysis and made a proposal for the construction of such samples.
  In this context we also dealt with the issues of weighted and correlated
  observations.
• Univariate risk factor analysis is the next development step. In Sect. 4.4 we have
  described the typical types of analysis required (measurement of a risk factor’s
  discriminatory power, transformation of risk factors, representativeness, fillers
  for missing values) and have mapped them to the samples on which they should
  be performed.
• In Sect. 4.5 we dealt with multi factor modelling, in particular with
  – Model selection
  – The violation of model assumptions (non-normality, heteroscedasticity, error
    term correlations)
  – The measurement of risk factor influence (weights)
  – Manual adjustments of empirical estimates and calibration
  – A method to empirically validate the average influence of corporate groups
    or sovereign supporters on an obligor’s rating
• Finally, in the same section, we gave a brief overview of the validation
  measures that we found most useful for a short-cut validation in the context of
  SRA rating development.


While for most modelling steps one can observe the emergence of best practice 
tools, we think that in particular in the following areas further research is desirable 
to sharpen the instruments available for SRA rating development: 


• Data pooling in order to arrive at more confident estimates for adjustment factors
  of external PDs that account for the differences between bank internal and
  external default measurement
• Empirical comparisons of the relative performance of cohort-based versus
  duration-based PD estimates and related confidence intervals
• Point-in-time adjustments of external through-the-cycle ratings
• Panel type correlation models for SRA samples and software implementations of
  these models
• Measurement of risk factor influence (weights)
Chapter 5
Estimating Probabilities of Default for Low 
Default Portfolios 


Katja Pluto and Dirk Tasche 


5.1 Introduction 


Probabilities of default (PD) per borrower are a core input to modern credit risk
modelling and management techniques. As such, the accuracy of the PD estimations
will determine the quality of the results of credit risk models.

One of the obstacles connected with PD estimations can be the low number of 
defaults, especially in the higher rating grades. These good rating grades might 
enjoy many years without any defaults. Even if some defaults occur in a given year, 
the observed default rates might exhibit a high degree of volatility due to the 
relatively low number of borrowers in that grade. Even entire portfolios with low 
or zero defaults are not uncommon. Examples include portfolios with an overall 
good quality of borrowers (e.g. sovereign or bank portfolios) as well as high- 
exposure low-number portfolios (e.g. specialized lending). 

Usual banking practices for deriving PD values in such exposures often focus on
qualitative mapping mechanisms to bank-wide master scales or external ratings.
These practices, while widespread in the industry, do not entirely satisfy the desire
for a statistical foundation of the assumed PD values. One might “believe” that the
PDs per rating grade appear correct, and that the ordinal ranking and
the relative spread between the PDs of two grades are right, but find that there is
insufficient information about the absolute PD figures. Lastly, it could be questioned
whether these rather qualitative methods of PD calibration fulfil the minimum
requirements set out in BCBS (2004a).
This issue, amongst others, has recently been raised in BBA (2004). In that
paper, applications of causal default models and of exogenous distribution assump- 
tions on the PDs across the grades have been proposed as solutions. Schuermann 
and Hanson (2004) present the “duration method” of estimating PDs by means of 
migration matrices (see also Jafry and Schuermann 2004). This way, nonzero PDs 
for high-quality rating grades can be estimated more precisely by both counting the 
borrower migrations through the lower grades to eventual default and using Markov 
chain properties. 

We present a methodology to estimate PDs for portfolios without any defaults, or 
a very low number of defaults in the overall portfolio. The proposal by Schuermann 
and Hanson (2004) does not provide a solution for such cases, because the duration 
method requires a certain number of defaults in at least some (usually the low- 
quality) rating grades. 

For estimating PDs, we use all available quantitative information of the rating 
system and its grades. Moreover, we assume that the ordinal borrower ranking is 
correct. We do not use any additional assumptions or information.¹ Our methodology
delivers confidence intervals for the PDs of each rating grade. The PD range
can be adjusted by the choice of an appropriate confidence level. Moreover, by the 
most prudent estimation principle our methodology yields monotonic PD estimates. 
We look both at the cases of uncorrelated and correlated default events, in the latter 
case under assumptions consistent with the Basel risk weight model. 

Moreover, we extend the most prudent estimation by two application variants: 
First we scale our results to overall portfolio central tendencies. Second, we apply 
our methodology to multi-period data and extend our model by time dependencies 
of the Basel systematic factor. Both variants should help to align our principle to 
realistic data sets and to a range of assumptions that can be set according to the 
specific issues in question when applying our methodology. 

The paper is structured as follows: The two main concepts underlying the
methodology (estimating PDs as upper confidence bounds and guaranteeing
their monotonicity by the most prudent estimation principle) are introduced by two
examples that assume independence of the default events. The first example deals
with a portfolio without any observed defaults. For the second example, we modify 
the first example by assuming that a few defaults have been observed. In a further 
section, we show how the methodology can be modified in order to take into 
account non-zero correlation of default events. This is followed by two sections 
discussing extensions of our methodology, in particular the scaling to the overall 
portfolio central tendency and an extension of our model to the multi-period case. 
The last two sections are devoted to discussions of the scope of application and of 


1 An important example of additional assumptions is provided by a-priori distributions of the PD
parameters which lead to a Bayesian approach as described by Kiefer (2009). Interestingly enough, 
Dwyer (2006) shows that the confidence bound approach as described in this paper can be 
interpreted in a Bayesian manner. Another example of an additional assumption is presented in 
Tasche (2009). In that paper the monotonicity assumption on the PDs is replaced by a stronger 
assumption on the shape of the PD curve. 
open questions. We conclude with a summary of our proposal. In Appendix A,
we provide information on the numerics that is needed to implement the estima- 
tion approach we suggest. Appendix B provides additional numerical results to 
Sect. 5.5. 

We perceive that our “most prudent estimation principle” has been applied in a 
wide range of banks since the first edition of this book. However, application has 
not been limited to PD estimation, as intended by us. Rather, risk modellers seem to 
have made generous use of the methodology to validate their rating systems. We 
have therefore added another short section at the end of this paper that explains the
sense and non-sense of using our principle for validation purposes, and clarifies what
the methodology can and cannot do.


5.2 Example: No Defaults, Assumption of Independence 


The obligors are distributed to rating grades A, B, and C, with frequencies $n_A$, $n_B$,
and $n_C$. The grade with the highest credit-worthiness is denoted by A, the grade with
the lowest credit-worthiness is denoted by C. No defaults occurred in A, B or C
during the last observation period.

We assume that the PDs (still to be estimated) $p_A$ of grade A, $p_B$ of grade B, and
$p_C$ of grade C reflect the decreasing credit-worthiness of the grades, in the sense of
the following inequality:

$$p_A \le p_B \le p_C \qquad (5.1)$$


The inequality implies that we assume the ordinal borrower ranking to be
correct. According to (5.1), the PD $p_A$ of grade A cannot be greater than the PD
$p_C$ of grade C. As a consequence, the most prudent estimate of the value $p_A$ is
obtained under the assumption that the probabilities $p_A$ and $p_C$ are equal. Then
(5.1) even implies $p_A = p_B = p_C$. Assuming this relation, we now proceed in
determining a confidence region for $p_A$ at confidence level $\gamma$. This confidence
region² can be described as the set of all admissible values of $p_A$ with the property
that the probability of not observing any default during the observation period is not
less than $1 - \gamma$ (for instance for $\gamma = 90\%$).

If $p_A = p_B = p_C$, then the three rating grades A, B, and C do not
differ in their respective riskiness. Hence we have to deal with a homogeneous
sample of size $n_A + n_B + n_C$ without any default during the observation period.
Assuming unconditional independence of the default events, the probability of


2 For any value of $p_A$ not belonging to this region, the hypothesis that the true PD takes on this
value would have to be rejected at a type I error level of $1 - \gamma$ (see Casella and Berger 2002,
Theorem 9.2.2 on the duality of hypothesis testing and confidence intervals).
observing no defaults turns out to be $(1 - p_A)^{n_A + n_B + n_C}$. Consequently, we have to
solve the inequality

$$1 - \gamma \le (1 - p_A)^{n_A + n_B + n_C} \qquad (5.2)$$

for $p_A$ in order to obtain the confidence region at level $\gamma$ for $p_A$ as the set of all the
values of $p_A$ such that

$$p_A \le 1 - (1 - \gamma)^{1/(n_A + n_B + n_C)}. \qquad (5.3)$$

If we choose for the sake of illustration

$$n_A = 100, \quad n_B = 400, \quad n_C = 300, \qquad (5.4)$$

Table 5.1 exhibits some values of confidence levels $\gamma$ with the corresponding
maximum values (upper confidence bounds) $\hat{p}_A$ of $p_A$ such that (5.2) is still satisfied.

According to Table 5.1, there is a strong dependence of the upper confidence
bound $\hat{p}_A$ on the confidence level $\gamma$. Intuitively, values of $\gamma$ smaller than 95% seem
more appropriate for estimating the PD by $\hat{p}_A$.

By inequality (5.1), the PD $p_B$ of grade B cannot be greater than the PD $p_C$ of
grade C either. Consequently, the most prudent estimate of $p_B$ is obtained by
assuming $p_B = p_C$. Assuming additional equality with the PD $p_A$ of the best
grade A would violate the most prudent estimation principle, because $p_A$ is a
lower bound of $p_B$. If $p_B = p_C$, then B and C do not differ in their
respective riskiness and may be considered a homogeneous sample of size $n_B + n_C$.
Therefore, the confidence region at level $\gamma$ for $p_B$ is obtained from the inequality

$$1 - \gamma \le (1 - p_B)^{n_B + n_C}. \qquad (5.5)$$

(5.5) implies that the confidence region for $p_B$ consists of all the values of $p_B$ that
satisfy

$$p_B \le 1 - (1 - \gamma)^{1/(n_B + n_C)}. \qquad (5.6)$$

If we again take up the example described by (5.4), Table 5.2 exhibits some
values of confidence levels $\gamma$ with the corresponding maximum values (upper
confidence bounds) $\hat{p}_B$ of $p_B$ such that (5.6) is still fulfilled.


Table 5.1 Upper confidence bound $\hat{p}_A$ of $p_A$ as a function of the confidence level $\gamma$. No defaults
observed, frequencies of obligors in grades given in (5.4)

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_A$    0.09%    0.17%    0.29%    0.37%    0.57%    0.86%


Table 5.2 Upper confidence bound $\hat{p}_B$ of $p_B$ as a function of the confidence level $\gamma$. No defaults
observed, frequencies of obligors in grades given in (5.4)

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_B$    0.10%    0.20%    0.33%    0.43%    0.66%    0.98%


Table 5.3 Upper confidence bound $\hat{p}_C$ of $p_C$ as a function of the confidence level $\gamma$. No defaults
observed, frequencies of obligors in grades given in (5.4)

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_C$    0.23%    0.46%    0.76%    0.99%    1.52%    2.28%


For determining the confidence region at level $\gamma$ for $p_C$ we only make use of the
observations in grade C because by (5.1) there is no obvious upper bound for $p_C$.
Hence the confidence region at level $\gamma$ for $p_C$ consists of those values of $p_C$ that
satisfy the inequality

$$1 - \gamma \le (1 - p_C)^{n_C}. \qquad (5.7)$$

Equivalently, the confidence region for $p_C$ can be described by

$$p_C \le 1 - (1 - \gamma)^{1/n_C}. \qquad (5.8)$$

Coming back to our example (5.4), Table 5.3 lists some values of confidence
levels $\gamma$ with the corresponding maximum values (upper confidence bounds) $\hat{p}_C$ of
$p_C$ such that (5.8) is still fulfilled.

Comparison of Tables 5.1-5.3 shows that, besides the confidence level $\gamma$,
the applicable sample size is a main driver of the upper confidence bound. The 
smaller the sample size, the greater will be the upper confidence bound. This is not 
an undesirable effect, because intuitively the credit-worthiness ought to be the 
better, the greater the number of obligors in a portfolio without any default 
observation. 

As the results presented so far seem plausible, we suggest using upper confi- 
dence bounds as described by (5.3), (5.6) and (5.8) as estimates for the PDs in 
portfolios without observed defaults. The case of three rating grades we have 
considered in this section can readily be generalized to an arbitrary number of 
grades. We do not present the details here. 
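
The generalization to an arbitrary number of grades can be sketched in a few lines of Python. The function name `upper_bounds_no_defaults` is hypothetical; the code simply pools each grade with all worse grades and applies the closed-form bound of (5.3), (5.6) and (5.8) under the independence assumption.

```python
def upper_bounds_no_defaults(counts, gamma):
    """Most prudent upper confidence bounds when no defaults were observed.

    counts: number of obligors per grade, ordered from the best to the worst grade.
    For grade k the sample pools grade k and all worse grades.
    """
    bounds = []
    for k in range(len(counts)):
        n_pooled = sum(counts[k:])
        bounds.append(1.0 - (1.0 - gamma) ** (1.0 / n_pooled))
    return bounds

counts = [100, 400, 300]  # n_A, n_B, n_C from (5.4)
for gamma in (0.50, 0.75, 0.90, 0.95, 0.99, 0.999):
    row = upper_bounds_no_defaults(counts, gamma)
    print(f"{gamma:.1%}: " + "  ".join(f"{p:.2%}" for p in row))
```

Because the pooled sample shrinks from the best to the worst grade, the resulting bounds are automatically non-decreasing across grades.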

However, the larger the number of obligors in the entire portfolio, the more often 
some defaults will occur in some grades at least, even if the general quality of the 
portfolio is very high. This case is not covered by (5.3), (5.6) and (5.8). In the 
following section, we will show — still keeping the assumption of independence of 
the default events — how the most prudent estimation methodology can be adapted 
to the case of a non-zero but still low number of defaults. 
5.3 Example: Few Defaults, Assumption of Independence


We consider again the portfolio from Sect. 5.2 with the frequencies $n_A$, $n_B$, and $n_C$.
In contrast to Sect. 5.2, this time we assume that during the last period no default 
was observed in grade A, two defaults were observed in grade B, and one default 
was observed in grade C. 

As in Sect. 5.2, we determine a most prudent confidence region for the PD $p_A$ of
A. Again, we do so by assuming that the PDs of the three grades are equal. This
allows us to treat the entire portfolio as a homogeneous sample of size $n_A + n_B + n_C$.
Then the probability of observing not more than three defaults is given by the
expression

$$\sum_{i=0}^{3} \binom{n_A + n_B + n_C}{i}\, p_A^i\, (1 - p_A)^{n_A + n_B + n_C - i}. \qquad (5.9)$$

Expression (5.9) follows from the fact that the number of defaults in the portfolio
is binomially distributed as long as the default events are independent. As a
consequence of (5.9), the confidence region³ at level $\gamma$ for $p_A$ is given as the set
of all the values of $p_A$ that satisfy the inequality

$$1 - \gamma \le \sum_{i=0}^{3} \binom{n_A + n_B + n_C}{i}\, p_A^i\, (1 - p_A)^{n_A + n_B + n_C - i}. \qquad (5.10)$$


The tail distribution of a binomial distribution can be expressed in terms of an
appropriate beta distribution function. Thus, inequality (5.10) can be solved analytically⁴
for $p_A$. For details, see Appendix A. If we assume again that the obligors’
numbers per grade are as in (5.4), Table 5.4 shows maximum solutions $\hat{p}_A$ of (5.10)
for different confidence levels $\gamma$.

Although in grade A no defaults were observed, the three defaults that occurred
during the observation period enter the calculation. They affect the upper confidence
bounds, which are much higher than those in Table 5.1. This is a consequence
of the precautionary assumption $p_A = p_B = p_C$. However, if we alternatively
considered grade A alone (by re-evaluating (5.8) with $n_A = 100$ instead of $n_C$),
we would obtain an upper confidence bound of 1.38% at level $\gamma = 75\%$. This value
is still much higher than the one that has been calculated under the precautionary
assumption $p_A = p_B = p_C$, a consequence of the low frequency of obligors in
grade A in this example. Nevertheless, we see that the methodology described by
(5.10) yields fairly reasonable results.
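
As a sketch of the numerical route, inequality (5.10) can be solved by bisection, since the binomial probability of observing at most k defaults decreases in the PD. The function names `binom_cdf` and `upper_bound` are hypothetical, and bisection is used here instead of the beta distribution representation described in Appendix A.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(number of defaults <= k) for n independent obligors with common PD p."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))

def upper_bound(k, n, gamma, tol=1e-12):
    """Largest p with binom_cdf(k, n, p) >= 1 - gamma, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, n, mid) >= 1.0 - gamma:
            lo = mid   # mid still satisfies the inequality, so move upwards
        else:
            hi = mid
    return lo

n_pooled, defaults = 100 + 400 + 300, 3   # pooled sample under p_A = p_B = p_C
for gamma in (0.50, 0.90, 0.99):
    print(f"gamma = {gamma:.0%}: {upper_bound(defaults, n_pooled, gamma):.2%}")
print(f"grade A alone, gamma = 75%: {upper_bound(0, 100, 0.75):.2%}")
```

The results should be close to the entries of Table 5.4; differences in the last printed digit cannot be ruled out, since the table values are rounded.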


3 We calculate the simple and intuitive exact Clopper-Pearson interval. For an overview of this
approach, as well as potential alternatives, see Brown et al. (2001).

4 Alternatively, solving (5.10) directly for $p_A$ by means of numerical tools is not too difficult either
(see Appendix A, Proposition A.1, for additional information).
Table 5.4 Upper confidence bound $\hat{p}_A$ of $p_A$ as a function of the confidence level $\gamma$. No default
observed in grade A, two defaults observed in grade B, one default observed in grade C,
frequencies of obligors in grades given in (5.4)

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_A$    0.46%    0.65%    0.83%    0.97%    1.25%    1.62%


Table 5.5 Upper confidence bound $\hat{p}_B$ of $p_B$ as a function of the confidence level $\gamma$. No default
observed in grade A, two defaults observed in grade B, one default observed in grade C,
frequencies of obligors in grades given in (5.4)

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_B$    0.52%    0.73%    0.95%    1.10%    1.43%    1.85%


In order to determine the confidence region at level $\gamma$ for $p_B$, as in Sect. 5.2, we
assume that $p_B$ takes its greatest possible value according to (5.1), i.e. that we have
$p_B = p_C$. In this situation, we have a homogeneous portfolio with $n_B + n_C$ obligors,
PD $p_B$, and three observed defaults. Analogous to (5.9), the probability of observing
no more than three defaults in one period then can be written as:

$$\sum_{i=0}^{3} \binom{n_B + n_C}{i}\, p_B^i\, (1 - p_B)^{n_B + n_C - i}. \qquad (5.11)$$


Hence, the confidence region at level $\gamma$ for $p_B$ turns out to be the set of all the
admissible values of $p_B$ which satisfy the inequality

$$1 - \gamma \le \sum_{i=0}^{3} \binom{n_B + n_C}{i}\, p_B^i\, (1 - p_B)^{n_B + n_C - i}. \qquad (5.12)$$


By analytically or numerically solving (5.12) for $p_B$, with frequencies of
obligors in the grades as in (5.4), we obtain Table 5.5 with some maximum
solutions $\hat{p}_B$ of (5.12) for different confidence levels $\gamma$.

From the given numbers of defaults in the different grades, it becomes clear that
a stand-alone treatment of grade B would yield still much higher values⁵ for the
upper confidence bounds. The upper confidence bound 0.52% of the confidence
region at level 50% is almost identical with the naive frequency based PD estimate
2/400 = 0.5% that could alternatively have been calculated for grade B in this
example.

For determining the confidence region at level y for the PD pc, by the same 
rationale as in Sect. 5.2, the grade C must be considered a stand-alone portfolio. 
According to the assumption made in the beginning of this section, one default 


5 At level 99.9%, e.g., 2.78% would be the value of the upper confidence bound.
Table 5.6 Upper confidence bound $\hat{p}_C$ of $p_C$ as a function of the confidence level $\gamma$. No default
observed in grade A, two defaults observed in grade B, one default observed in grade C,
frequencies of obligors in grades given in (5.4)

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_C$    0.56%    0.90%    1.29%    1.57%    2.19%    3.04%


occurred among the $n_C$ obligors in C. Hence we see that the confidence region for
$p_C$ is the set of all admissible values of $p_C$ that satisfy the inequality

$$1 - \gamma \le \sum_{i=0}^{1} \binom{n_C}{i}\, p_C^i\, (1 - p_C)^{n_C - i} = (1 - p_C)^{n_C} + n_C\, p_C\, (1 - p_C)^{n_C - 1}. \qquad (5.13)$$


For obligor frequencies as assumed in example (5.4), Table 5.6 exhibits some
maximum solutions⁶ $\hat{p}_C$ of (5.13) for different confidence levels $\gamma$.

So far, we have described how to generalize the methodology from Sect. 5.2 to 
the case where non-zero default frequencies have been recorded. In the following 
section we investigate the impact of non-zero default correlation on the PD estimates
that are obtained by applying the most prudent estimation methodology.
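
The pooling rule used throughout Sects. 5.2 and 5.3 can be written generically: for each grade, the obligors and defaults of that grade and of all worse grades are pooled, and the binomial inequality is solved for the pooled sample. The sketch below uses hypothetical function names and solves the inequality by bisection. It also illustrates the situation where two defaults occur in grade B and none in grade C, in which the bound for C falls below the bound for B.

```python
from math import comb

def pooled_upper_bound(k, n, gamma, tol=1e-12):
    """Largest p such that P(at most k defaults among n obligors) >= 1 - gamma."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        cdf = sum(comb(n, i) * mid ** i * (1.0 - mid) ** (n - i) for i in range(k + 1))
        lo, hi = (mid, hi) if cdf >= 1.0 - gamma else (lo, mid)
    return lo

def most_prudent_bounds(counts, defaults, gamma):
    """Upper bounds per grade (best grade first): grade g is pooled with all worse grades."""
    return [pooled_upper_bound(sum(defaults[g:]), sum(counts[g:]), gamma)
            for g in range(len(counts))]

counts = [100, 400, 300]  # n_A, n_B, n_C from (5.4)

# Defaults as in this section: none in A, two in B, one in C.
print([f"{p:.2%}" for p in most_prudent_bounds(counts, [0, 2, 1], 0.50)])

# Two defaults in B, none in C: the bound for C is smaller than the bound for B.
print([f"{p:.2%}" for p in most_prudent_bounds(counts, [0, 2, 0], 0.50)])
```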


5.4 Example: Correlated Default Events 


In this section, we describe the dependence of the default events with the one-factor
probit model⁷ that was the starting point for developing the risk weight functions
given in BCBS (2004a)⁸. First, we use the example from Sect. 5.2 and assume that
no default at all was observed in the whole portfolio during the last period. In order
to illustrate the effects of correlation, we apply the minimum value of the asset
correlation that appears in the Basel II corporate risk weight function. This minimum
value is 12% (see BCBS 2004a, § 272). Our model, however, works with any
other correlation assumption as well. Likewise, the most prudent estimation principle
could potentially be applied to other models than the Basel II type credit risk
model as long as the inequalities can be solved for $p_A$, $p_B$ and $p_C$, respectively.


6 If we had assumed that two defaults occurred in grade B but no default was observed in grade C,
then we would have obtained smaller upper bounds for $p_C$ than for $p_B$. As this is not a desirable
effect, a possible (conservative) work-around could be to increment the number of defaults in
grade C up to the point where $\hat{p}_C$ would take on a greater value than $\hat{p}_B$. Nevertheless, in this case
one would have to make sure that the applied rating system yields indeed a correct ranking of the
obligors.

7 According to De Finetti’s theorem (see, e.g., Durrett (1996), Theorem 6.8), assuming one
systematic factor only is not very restrictive.

8 See Gordy (2003) and BCBS (2004b) for the background of the risk weight functions. In the case
of non-zero realized default rates Balthazar (2004) uses the one-factor model for deriving
confidence intervals of the PDs.
Table 5.7 Upper confidence bounds $\hat{p}_A$ of $p_A$, $\hat{p}_B$ of $p_B$ and $\hat{p}_C$ of $p_C$ as a function of the confidence
level $\gamma$. No defaults observed, frequencies of obligors in grades given in (5.4). Correlated default
events

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_A$    0.15%    0.40%    0.86%    1.31%    2.65%    5.29%
$\hat{p}_B$    0.17%    0.45%    0.96%    1.45%    2.92%    5.77%
$\hat{p}_C$    0.37%    0.92%    1.89%    2.78%    5.30%    9.84%


Under the assumptions of this section, the confidence region at level $\gamma$ for $p_A$
is represented as the set of all admissible values of $p_A$ that satisfy the inequality
(cf. Bluhm et al. 2003, Sects. 2.1.2 and 2.5.1 for the derivation)

$$1 - \gamma \le \int_{-\infty}^{\infty} \varphi(y) \left(1 - \Phi\left(\frac{\Phi^{-1}(p_A) - \sqrt{\rho}\, y}{\sqrt{1 - \rho}}\right)\right)^{n_A + n_B + n_C} dy, \qquad (5.14)$$

where $\varphi$ and $\Phi$ stand for the standard normal density and standard normal distribution
function, respectively. $\Phi^{-1}$ denotes the inverse function of $\Phi$ and $\rho$ is the asset
correlation (here $\rho$ is chosen as $\rho = 12\%$). Similarly to (5.2), the right-hand side of
inequality (5.14) tells us the one-period probability of not observing any default
among $n_A + n_B + n_C$ obligors with average PD $p_A$.

Solving⁹ (5.14) numerically¹⁰ for the frequencies as given in (5.4) leads to
Table 5.7 with maximum solutions $\hat{p}_A$ of (5.14) for different confidence levels $\gamma$.

Comparing the values from the first line of Table 5.7 with Table 5.1 shows that 
the impact of taking care of correlations is moderate for the low confidence levels 
50% and 75%. The impact is much higher for the levels higher than 90% (for the 
confidence level 99.9% the bound is even six times larger). This observation reflects 
the general fact that introducing unidirectional stochastic dependence in a sum of 
random variables entails a redistribution of probability mass from the centre of the 
distribution towards its lower and upper limits. 

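The formulae for the estimations of upper confidence bounds for $p_B$ and $p_C$ can
be derived analogously to (5.14) [in combination with (5.5) and (5.7)]. This yields
the inequalities

$$1 - \gamma \le \int_{-\infty}^{\infty} \varphi(y) \left(1 - \Phi\left(\frac{\Phi^{-1}(p_B) - \sqrt{\rho}\, y}{\sqrt{1 - \rho}}\right)\right)^{n_B + n_C} dy \qquad (5.15)$$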

9 See Appendix A, Proposition A.2, for additional information. Taking into account correlations
entails an increase in numerical complexity. Therefore, it might seem to be more efficient to deal
with the correlation problem by choosing an appropriately enlarged confidence level in the
independent default events approach as described in Sects. 5.2 and 5.3. However, it remains
open how a confidence level for the uncorrelated case, that “appropriately” adjusts for the
correlations, can be derived.


10 The more intricate calculations for this paper were conducted by means of the software package
R (cf. R Development Core Team 2003).




Table 5.8 Upper confidence bounds $\hat{p}_A$ of $p_A$, $\hat{p}_B$ of $p_B$ and $\hat{p}_C$ of $p_C$ as a function of the confidence
level $\gamma$. No default observed in grade A, two defaults observed in grade B, one default observed in
grade C, frequencies of obligors in grades given in (5.4). Correlated default events

$\gamma$       50%      75%      90%      95%      99%      99.9%
$\hat{p}_A$    0.72%    1.42%    2.50%    3.42%    5.88%    10.08%
$\hat{p}_B$    0.81%    1.59%    2.77%    3.77%    6.43%    10.92%
$\hat{p}_C$    0.84%    1.76%    3.19%    4.41%    7.68%    13.14%

and

$$1 - \gamma \le \int_{-\infty}^{\infty} \varphi(y) \left(1 - \Phi\left(\frac{\Phi^{-1}(p_C) - \sqrt{\rho}\, y}{\sqrt{1 - \rho}}\right)\right)^{n_C} dy \qquad (5.16)$$


to be solved for $p_B$ and $p_C$ respectively. The numerical calculations with (5.15) and
(5.16) do not deliver additional qualitative insights. For the sake of completeness,
however, the maximum solutions $\hat{p}_B$ of (5.15) and $\hat{p}_C$ of (5.16) for different
confidence levels $\gamma$ are listed in rows 3 and 4 of Table 5.7, respectively.

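Secondly, we apply our correlated model to the example from Sect. 5.3 and
assume that three defaults were observed during the last period. Analogous to (5.9),
(5.10) and (5.14), the confidence region at level $\gamma$ for $p_A$ is represented as the set of
all values of $p_A$ that satisfy the inequality

$$1 - \gamma \le \int_{-\infty}^{\infty} \varphi(y)\, z(y)\, dy,$$

$$z(y) = \sum_{i=0}^{3} \binom{n_A + n_B + n_C}{i}\, G(p_A, \rho, y)^i\, \left(1 - G(p_A, \rho, y)\right)^{n_A + n_B + n_C - i}, \qquad (5.17)$$

where the function G is defined by

$$G(p, \rho, y) = \Phi\left(\frac{\Phi^{-1}(p) - \sqrt{\rho}\, y}{\sqrt{1 - \rho}}\right). \qquad (5.18)$$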

Solving (5.17) for $p_A$ with obligor frequencies as given in (5.4), and the respective
modified equations for $p_B$ and $p_C$, yields the results presented in Table 5.8.

Not surprisingly, as shown in Table 5.8 the maximum solutions for $\hat{p}_A$, $\hat{p}_B$ and $\hat{p}_C$
increase if we introduce defaults in our example. Other than that, the results do not
deliver essential additional insights.
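
A minimal numerical sketch of (5.14) and (5.17) is given below. It is not the R implementation referred to in the text: the integral over the systematic factor is approximated with Simpson's rule on a truncated range, the inequality is solved by bisection, and all function names, the integration range and the step count are assumptions of this sketch.

```python
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def prob_at_most_k(p, k, n, rho, steps=1000, y_max=8.0):
    """P(at most k defaults among n obligors) in the one-factor probit model.

    Integrates the conditional binomial probability over the systematic factor y
    with Simpson's rule on [-y_max, y_max], cf. (5.14) for k = 0 and (5.17) for k = 3.
    """
    threshold = STD_NORMAL.inv_cdf(p)
    coeffs = [comb(n, i) for i in range(k + 1)]
    h = 2.0 * y_max / steps
    total = 0.0
    for j in range(steps + 1):
        y = -y_max + j * h
        # Conditional PD G(p, rho, y), cf. (5.18).
        g = STD_NORMAL.cdf((threshold - sqrt(rho) * y) / sqrt(1.0 - rho))
        z = sum(c * g ** i * (1.0 - g) ** (n - i) for i, c in enumerate(coeffs))
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * STD_NORMAL.pdf(y) * z
    return total * h / 3.0

def upper_bound_correlated(k, n, gamma, rho=0.12, tol=1e-7):
    """Largest p with prob_at_most_k(p, k, n, rho) >= 1 - gamma, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_at_most_k(mid, k, n, rho) >= 1.0 - gamma:
            lo = mid
        else:
            hi = mid
    return lo

n_pooled = 100 + 400 + 300  # pooled sample for grade A, frequencies from (5.4)
print(f"no defaults,    gamma = 50%: {upper_bound_correlated(0, n_pooled, 0.50):.2%}")
print(f"three defaults, gamma = 50%: {upper_bound_correlated(3, n_pooled, 0.50):.2%}")
```

The bounds for grades B and C follow by passing the pooled obligor and default counts of the respective grades, in the same way as in the independent case.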


5.5 Extension: Calibration by Scaling Factors 


One of the drawbacks of the most prudent estimation principle is that in the case of
few defaults, the upper confidence bound PD estimates for all grades are higher than
the average default rate of the overall portfolio. This phenomenon is not surprising,
given that we include all defaults of the overall portfolio in the upper confidence
bound estimation even for the highest rating grade. However, these estimates might
be regarded as too conservative by some practitioners.

Table 5.9 Upper confidence bounds $\hat{p}_{A,\mathrm{scaled}}$ of $p_A$, $\hat{p}_{B,\mathrm{scaled}}$ of $p_B$ and $\hat{p}_{C,\mathrm{scaled}}$ of $p_C$ as a
function of the confidence level $\gamma$ after scaling to the central tendency. No default observed in
grade A, two defaults observed in grade B, one default observed in grade C, frequencies of obligors
in grades given in (5.4). Uncorrelated default events

$\gamma$                          50%      75%      90%      95%      99%      99.9%
Central tendency                  0.375%   0.375%   0.375%   0.375%   0.375%   0.375%
K                                 0.71     0.48     0.35     0.30     0.22     0.17
$\hat{p}_{A,\mathrm{scaled}}$     0.33%    0.31%    0.29%    0.29%    0.28%    0.27%
$\hat{p}_{B,\mathrm{scaled}}$     0.37%    0.35%    0.34%    0.33%    0.32%    0.31%
$\hat{p}_{C,\mathrm{scaled}}$     0.40%    0.43%    0.46%    0.47%    0.49%    0.50%

Table 5.10 Upper confidence bounds $\hat{p}_{A,\mathrm{scaled}}$ of $p_A$, $\hat{p}_{B,\mathrm{scaled}}$ of $p_B$ and $\hat{p}_{C,\mathrm{scaled}}$ of $p_C$ as a
function of the confidence level $\gamma$ after scaling to the central tendency. No default observed in
grade A, two defaults observed in grade B, one default observed in grade C, frequencies of obligors
in grades given in (5.4). Correlated default events

$\gamma$                          50%      75%      90%      95%      99%      99.9%
Central tendency                  0.375%   0.375%   0.375%   0.375%   0.375%   0.375%
K                                 0.46     0.23     0.13     0.09     0.05     0.03
$\hat{p}_{A,\mathrm{scaled}}$     0.33%    0.33%    0.32%    0.32%    0.32%    0.32%
$\hat{p}_{B,\mathrm{scaled}}$     0.38%    0.37%    0.36%    0.36%    0.35%    0.35%
$\hat{p}_{C,\mathrm{scaled}}$     0.39%    0.40%    0.41%    0.42%    0.42%    0.42%

A remedy would be a scaling¹¹ of all of our estimates towards the central
tendency (the average portfolio default rate). We introduce a scaling factor K to
our estimates such that the overall portfolio default rate is exactly met, i.e.

$$\frac{\hat{p}_A n_A + \hat{p}_B n_B + \hat{p}_C n_C}{n_A + n_B + n_C}\, K = PD_{\mathrm{Portfolio}}. \qquad (5.19)$$

The new, scaled PD estimates will then be

$$\hat{p}_{X,\mathrm{scaled}} = K\, \hat{p}_X, \qquad X = A, B, C. \qquad (5.20)$$


The results of the application of such a scaling factor to our “few defaults” 
examples of Sects. 5.3 and 5.4 are shown in Tables 5.9 and 5.10, respectively. 
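
The scaling step of (5.19) and (5.20) is simple arithmetic and can be sketched as below. The helper name `scale_to_target` is hypothetical, and the obligor frequencies $n_A = 100$, $n_B = 400$, $n_C = 300$ are an assumption standing in for (5.4); they are consistent with the central tendency of 0.375% = 3/800 shown in Tables 5.9 and 5.10. The unscaled inputs are the 50% column of Table 5.8.

```python
def scale_to_target(p_hat, n, pd_target):
    """Return the scaling factor K of (5.19) and the scaled PDs of (5.20)."""
    weighted_average = sum(p_hat[x] * n[x] for x in p_hat) / sum(n.values())
    k = pd_target / weighted_average
    return k, {x: k * p for x, p in p_hat.items()}

# Assumed obligor frequencies per grade (stand-in for (5.4)).
n = {"A": 100, "B": 400, "C": 300}
# Unscaled upper confidence bounds, gamma = 50%, correlated case (Table 5.8).
p_hat = {"A": 0.0072, "B": 0.0081, "C": 0.0084}
# Central tendency: three defaults among 800 obligors.
k, scaled = scale_to_target(p_hat, n, pd_target=3 / 800)

print(round(k, 2))
print({x: f"{p:.2%}" for x, p in scaled.items()})
```

With these inputs K comes out at roughly 0.46, in line with the 50% column of Table 5.10; small differences in the last digit of the scaled PDs can arise because the inputs are themselves rounded.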

The average estimated portfolio PD will now fit exactly the overall portfolio
central tendency. Thus, we remove all conservatism from our estimations. Given the
poor default data base in typical applications of our methodology, this might be seen
as a disadvantage rather than an advantage. By using the most prudent estimation
principle to derive “relative” PDs before scaling them down to the final results, we
preserve the sole dependence of the PD estimates upon the borrower frequencies in
the respective rating grades, as well as the monotonicity of the PDs.


11 A similar scaling procedure was suggested by Benjamin et al. (2006). However, the
straightforward linear approach as described in (5.19) and (5.20) has the drawback that, in
principle, the resulting PDs can exceed 100%. See Tasche (2009, Appendix A) for a non-linear
scaling approach based on Bayes’ formula that avoids this issue.

The question of the appropriate confidence level for the above calculations 
remains. Although the average estimated portfolio PD now always fits the overall 
portfolio default rate, the confidence level determines the “distribution” of that rate 
over the rating grades. In the above example, though, the differences in distribution 
appear small, especially in the correlated case, such that we would not explore this 
issue further. The confidence level could, in practice, be used to control the spread 
of PD estimates over the rating grades — the higher the confidence level, the higher 
the spread. 

However, the above scaling works only if there is a nonzero number of defaults 
in the overall portfolio. Zero default portfolios would indeed be treated more 
severely if we continued to apply our original proposal to them, compared to using 
scaled PDs for low default portfolios. 

A variant of the above scaling proposal that takes care of both issues is the use of 
an upper confidence bound for the overall portfolio PD in lieu of the actual default 
rate. This upper confidence bound for the overall portfolio PD, incidentally, equals 
the most prudent estimate for the highest rating grade. Then, the same scaling 
methodology as described above can be applied. The results of its application to the 
few defaults examples as in Tables 5.9 and 5.10 are presented in Tables 5.11 
and 5.12. 
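
For the uncorrelated case, the upper confidence bound for the overall portfolio PD can be sketched with a plain binomial model, as below. This is an illustration under assumptions, not the authors' code: the function names and the bisection search are our own, and the portfolio size of 800 obligors is assumed from the central tendency of 0.375% = 3/800.

```python
from math import comb

def binom_cdf(k, n, p):
    """P[at most k defaults] among n independent obligors with common PD p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))

def upper_bound_independent(gamma, n, k, tol=1e-10):
    """Largest p such that 1 - gamma <= P[at most k defaults]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, n, mid) >= 1.0 - gamma:
            lo = mid
        else:
            hi = mid
    return lo

# Assumed overall portfolio: 800 obligors, three defaults.
for gamma in (0.5, 0.75, 0.9, 0.95, 0.99, 0.999):
    print(f"{gamma:.1%}: {upper_bound_independent(gamma, 800, 3):.2%}")
```

Under these assumptions the values should be close to the row “Upper bound for portfolio PD” of Table 5.11. The scaling factor K then follows from (5.19) with this bound in place of the observed default rate.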


Table 5.11 Upper confidence bounds $\hat{p}_{A,\mathrm{scaled}}$ of $p_A$, $\hat{p}_{B,\mathrm{scaled}}$ of $p_B$ and $\hat{p}_{C,\mathrm{scaled}}$ of $p_C$ as a
function of the confidence level $\gamma$ after scaling to the upper confidence bound of the overall
portfolio PD. No default observed in grade A, two defaults observed in grade B, one default
observed in grade C, frequencies of obligors in grades given in (5.4). Uncorrelated default events

$\gamma$                          50%      75%      90%      95%      99%      99.9%
Upper bound for portfolio PD      0.46%    0.65%    0.83%    0.97%    1.25%    1.62%
K                                 0.87     0.83     0.78     0.77     0.74     0.71
$\hat{p}_{A,\mathrm{scaled}}$     0.40%    0.54%    0.65%    0.74%    0.92%    1.16%
$\hat{p}_{B,\mathrm{scaled}}$     0.45%    0.61%    0.74%    0.84%    1.06%    1.32%
$\hat{p}_{C,\mathrm{scaled}}$     0.49%    0.75%    1.01%    1.22%    1.62%    2.17%


Table 5.12 Upper confidence bounds $\hat{p}_{A,\mathrm{scaled}}$ of $p_A$, $\hat{p}_{B,\mathrm{scaled}}$ of $p_B$ and $\hat{p}_{C,\mathrm{scaled}}$ of $p_C$ as a
function of the confidence level $\gamma$ after scaling to the upper confidence bound of the overall
portfolio PD. No default observed in grade A, two defaults observed in grade B, one default
observed in grade C, frequencies of obligors in grades given in (5.4). Correlated default events

$\gamma$                          50%      75%      90%      95%      99%      99.9%
Upper bound for portfolio PD      0.71%    1.42%    2.50%    3.42%    5.88%    10.08%
K                                 0.89     0.87     0.86     0.86     0.86     0.87
$\hat{p}_{A,\mathrm{scaled}}$     0.64%    1.24%    2.16%    2.95%    5.06%    8.72%
$\hat{p}_{B,\mathrm{scaled}}$     0.72%    1.38%    2.39%    3.25%    5.54%    9.54%
$\hat{p}_{C,\mathrm{scaled}}$     0.75%    1.53%    2.76%    3.80%    6.61%    11.37%

In contrast to the situation of Tables 5.9 and 5.10, in Tables 5.11 and 5.12 the
overall default rate in the portfolio depends on the confidence level, and we observe 
scaled PD estimates for the grades that increase with growing levels. Nevertheless, 
the scaled PD estimates for the better grades are still considerably lower than the 
corresponding unscaled estimates from Sects. 5.3 and 5.4, respectively. For the sake 
of comparison, we provide in Appendix B the analogous numerical results for the 
no default case. 

The advantage of this latter variant of the scaling approach is that the degree of 
conservatism is actively manageable by the appropriate choice of the confidence 
level for the estimation of the upper confidence bound of the portfolio PD. More- 
over, it works in the case of zero defaults and few defaults, and thus does not 
produce a structural break between both scenarios. Lastly, the results are less 
conservative than those of our original methodology. 


5.6 Extension: The Multi-period Case 


So far, we have only considered the situation where estimations are carried out on a 
1 year (or one observation period) data sample. In case of a time series with data 
from several years, the PDs (per rating grade) for the single years could be 
estimated and could then be used for calculating weighted averages of the PDs in 
order to make more efficient use of the data. By doing so, however, the interpreta- 
tion of the estimates as upper confidence bounds at some pre-defined level would 
be lost. 

Alternatively, the data of all years could be pooled and tackled as in the 1-year 
case. When assuming cross-sectional and inter-temporal independence of the 
default events, the methodology as presented in Sects. 5.2 and 5.3 can be applied 
to the data pool by replacing the 1-year frequency of a grade with the sum of the 
frequencies of this grade over the years (analogous for the numbers of defaulted 
obligors). This way, the interpretation of the results as upper confidence bounds as 
well as the frequency-dependent degree of conservatism of the estimates will be 
preserved. 

However, when turning to the case of default events which are cross-sectionally 
and inter-temporally correlated, pooling does not allow for an adequate modelling. 
An example would be a portfolio of long-term loans, where in the inter-temporal 
pool every obligor would appear several times. As a consequence, the dependence 
structure of the pool would have to be specified very carefully, as the structure of 
correlation over time and of cross-sectional correlation are likely to differ. 

In this section, we present two multi-period extensions of the cross-sectional 
one-factor correlation model that has been introduced in Sect. 5.4. In the first part of 
the section, we take the perspective of an observer of a cohort of obligors over a 
fixed interval of time. The advantage of such a view arises from the conceptual 
separation of time and cross-section effects. Again, we do not present the metho- 
dology in full generality but rather introduce it by way of an example. 

As in Sect. 5.4, we assume that, at the beginning of the observation period, we
have $n_A$ obligors in grade A, $n_B$ obligors in grade B, and $n_C$ obligors in grade C.
In contrast to Sect. 5.4, the length of the observation period this time is $T > 1$. We
consider only the obligors that were present at the beginning of the observation
period. Any obligors entering the portfolio afterwards are neglected for the purpose
of our estimation exercise. Nevertheless, the number of observed obligors may vary
from year to year as soon as any defaults occur.

As in the previous sections, we first consider the estimation of the PD $p_A$ for
grade A. PD in this section denotes a long-term average 1-year probability of
default. Working again with the most prudent estimation principle, we assume
that the PDs $p_A$, $p_B$, and $p_C$ are equal, i.e. $p_A = p_B = p_C = p$. We assume, similar
to Gordy (2003), that a default of obligor $i = 1, \ldots, N = n_A + n_B + n_C$ in year
$t = 1, \ldots, T$ is triggered if the change in value of their assets results in a value lower
than some default threshold c as described below by (5.22). Specifically, if $V_{i,t}$ denotes
the change in value of obligor i’s assets, $V_{i,t}$ is given by

$$V_{i,t} = \sqrt{\rho}\, S_t + \sqrt{1 - \rho}\, \xi_{i,t}, \qquad (5.21)$$

where $\rho$ stands for the asset correlation as introduced in Sect. 5.4, $S_t$ is the
realisation of the systematic factor in year t, and $\xi_{i,t}$ denotes the idiosyncratic
component of the change in value. The cross-sectional dependence of the default
events stems from the presence of the systematic factor $S_t$ in all the obligors’ change
in value variables. Obligor i’s default occurs in year t if

$$V_{i,1} > c, \ldots, V_{i,t-1} > c, \; V_{i,t} \le c. \qquad (5.22)$$

The probability

$$P[V_{i,t} \le c] = p_{i,t} = p \qquad (5.23)$$

is the parameter we are interested in estimating: it describes the long-term average
1-year probability of default among the obligors that have not defaulted before. The
indices i and t at $p_{i,t}$ can be dropped because, by the assumptions we are going to
specify below, $p_{i,t}$ will depend neither on i nor on t. To some extent, therefore, p may
be considered a through-the-cycle PD.

For the sake of computational feasibility, and in order to keep as close as possible
to the Basel II risk weight model, we specify the factor variables $S_t$, $t = 1, \ldots, T$, and
$\xi_{i,t}$, $i = 1, \ldots, N$, $t = 1, \ldots, T$, as standard normally distributed (cf. Bluhm et al. 2003).
Moreover, we assume that the random vector $(S_1, \ldots, S_T)$ and the random variables
$\xi_{i,t}$, $i = 1, \ldots, N$, $t = 1, \ldots, T$, are independent. As a consequence, from (5.21) it
follows that the change in value variables $V_{i,t}$ are all standard normally distributed.
Therefore, (5.23) implies that the default threshold¹² c is determined by

$$c = \Phi^{-1}(p), \qquad (5.24)$$

with $\Phi$ denoting the standard normal distribution function.

While the single components $S_t$ of the vector of systematic factors generate the
cross-sectional correlation of the default events at time t, their inter-temporal
correlation is affected by the dependence structure of the factors $S_1, \ldots, S_T$. We
further assume that not only the components but also the vector as a whole is
normally distributed. Since the components of the vector are standardized, its joint
distribution is completely determined by the correlation matrix

$$\begin{pmatrix} 1 & r_{1,2} & r_{1,3} & \cdots & r_{1,T} \\ r_{2,1} & 1 & r_{2,3} & \cdots & r_{2,T} \\ \vdots & & \ddots & & \vdots \\ r_{T,1} & r_{T,2} & r_{T,3} & \cdots & 1 \end{pmatrix}. \qquad (5.25)$$

Whereas the cross-sectional correlation within 1 year is constant for any pair of
obligors, empirical observation indicates that the effect of inter-temporal correlation
becomes weaker with increasing distance in time. We express this distance-dependent
behaviour¹³ of correlations by setting in (5.25)

$$r_{s,t} = \vartheta^{|s - t|}, \qquad s, t = 1, \ldots, T, \; s \ne t, \qquad (5.26)$$

for some appropriate $0 < \vartheta < 1$ to be specified below.

Let us assume that within the T years observation period $k_A$ defaults were
observed among the obligors that were initially graded A, $k_B$ defaults among the
initially graded B obligors and $k_C$ defaults among the initially graded C obligors.
For the estimation of $p_A$ according to the most prudent estimation principle,
therefore, we have to take into account $k = k_A + k_B + k_C$ defaults among N
obligors over T years. For any given confidence level $\gamma$, we have to determine the
maximum value $\hat{p}$ of all the parameters p such that the inequality

$$1 - \gamma \le P[\text{No more than } k \text{ defaults observed}] \qquad (5.27)$$

is satisfied. Note that the right-hand side of (5.27) depends on the one-period
probability of default p. In order to derive a formulation that is accessible to
numerical calculation, we have to rewrite the right-hand side of (5.27).


12 At first sight, the fact that in our model the default threshold is constant over time seems to imply
that the model does not reflect the possibility of rating migrations. However, by construction of the
model, the conditional default threshold at time t given the value $V_{i,t-1}$ will in general differ from c.
As we make use of the joint distribution of the $V_{i,t}$, therefore rating migrations are implicitly taken
into account.

13 Blochwitz et al. (2004) proposed the specification of the inter-temporal dependence structure
according to (5.26) for the purpose of default probability estimation.

The first step is to develop an expression for obligor i’s conditional probability to
default during the observation period, given a realization of the systematic factors
$S_1, \ldots, S_T$. From (5.21), (5.22), (5.24) and by using the conditional independence of
the $V_{i,1}, \ldots, V_{i,T}$ given the systematic factors, we obtain

$$\begin{aligned}
P[\text{Obligor } i \text{ defaults} \mid S_1, \ldots, S_T]
&= 1 - P\left[\xi_{i,1} > \frac{\Phi^{-1}(p) - \sqrt{\rho}\, S_1}{\sqrt{1 - \rho}}, \ldots, \xi_{i,T} > \frac{\Phi^{-1}(p) - \sqrt{\rho}\, S_T}{\sqrt{1 - \rho}} \,\middle|\, S_1, \ldots, S_T\right] \\
&= 1 - \prod_{t=1}^{T} \bigl(1 - G(p, \rho, S_t)\bigr),
\end{aligned} \qquad (5.28)$$

where the function G is defined as in (5.18). By construction, in the model all the
probabilities $P[\text{Obligor } i \text{ defaults} \mid S_1, \ldots, S_T]$ are equal, so that, for any of the i, we
can define

$$\pi(S_1, \ldots, S_T) = P[\text{Obligor } i \text{ defaults} \mid S_1, \ldots, S_T] = 1 - \prod_{t=1}^{T} \bigl(1 - G(p, \rho, S_t)\bigr). \qquad (5.29)$$

Using this abbreviation, we can write the right-hand side of (5.27) as

$$\begin{aligned}
P[\text{No more than } k \text{ defaults observed}]
&= \sum_{l=0}^{k} E\bigl[P[\text{Exactly } l \text{ obligors default} \mid S_1, \ldots, S_T]\bigr] \\
&= \sum_{l=0}^{k} \binom{N}{l} E\bigl[\pi(S_1, \ldots, S_T)^{l}\, (1 - \pi(S_1, \ldots, S_T))^{N - l}\bigr].
\end{aligned} \qquad (5.30)$$

The expectations in (5.30) are expectations with respect to the random vector
$(S_1, \ldots, S_T)$ and have to be calculated as T-dimensional integrals involving the
density of the T-variate standard normal distribution with correlation matrix
given by (5.25) and (5.26). When solving (5.27) for $\hat{p}$, we calculated the values of
these T-dimensional integrals by means of Monte-Carlo simulation, taking advantage
of the fact that the term

$$\sum_{l=0}^{k} \binom{N}{l} \pi(S_1, \ldots, S_T)^{l}\, (1 - \pi(S_1, \ldots, S_T))^{N - l} \qquad (5.31)$$

can be efficiently evaluated by making use of (5.35) of Appendix A.
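
A minimal Monte-Carlo sketch of this calculation is given below. It is an illustration under assumptions rather than the authors' implementation: the factor paths are generated by an AR(1) recursion (our choice, which reproduces the correlation structure $r_{s,t} = \vartheta^{|s-t|}$ of (5.26) with standard normal marginals), the term (5.31) is evaluated by direct summation instead of (5.35) of Appendix A, and the portfolio size N = 800 as well as all function names are hypothetical. The parameter values $T = 5$, $\rho = 0.12$ and $\vartheta = 0.3$ are those of the numerical examples in this section.

```python
import random
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def simulate_factors(n_sims, T, theta, seed=1):
    """Draw paths (S_1, ..., S_T) with corr[S_s, S_t] = theta**|s - t| via an AR(1) recursion."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_sims):
        s = [rng.gauss(0.0, 1.0)]
        for _ in range(T - 1):
            s.append(theta * s[-1] + sqrt(1.0 - theta**2) * rng.gauss(0.0, 1.0))
        paths.append(s)
    return paths

def prob_at_most_k(p, rho, n, k, paths):
    """Monte-Carlo estimate of (5.30): average of the term (5.31) over the factor paths."""
    c = STD_NORMAL.inv_cdf(p)  # default threshold (5.24)
    total = 0.0
    for s in paths:
        survive = 1.0
        for s_t in s:
            survive *= 1.0 - STD_NORMAL.cdf((c - sqrt(rho) * s_t) / sqrt(1.0 - rho))
        pi_s = 1.0 - survive  # conditional default probability (5.29)
        total += sum(comb(n, l) * pi_s**l * (1.0 - pi_s) ** (n - l) for l in range(k + 1))
    return total / len(paths)

def upper_bound(gamma, rho, n, k, paths, tol=1e-6):
    """Largest p satisfying (5.27); the same paths are reused so the search stays monotone."""
    lo, hi = 1e-8, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_at_most_k(mid, rho, n, k, paths) >= 1.0 - gamma:
            lo = mid
        else:
            hi = mid
    return lo

paths = simulate_factors(n_sims=2000, T=5, theta=0.3)
bound_no_default = upper_bound(0.5, 0.12, 800, 0, paths)      # k = 0 defaults
bound_three_defaults = upper_bound(0.5, 0.12, 800, 3, paths)  # k = 3 defaults
print(f"{bound_no_default:.3%}  {bound_three_defaults:.3%}")
```

Reusing one fixed set of simulated paths for every candidate p keeps the estimated probability monotone in p, so that the bisection search is well defined. The results carry simulation noise and will therefore only approximately match tabulated values.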

In order to present some numerical results for an illustration of how the model
works, we have to fix a time horizon T and values for the cross-sectional correlation
$\rho$ and the inter-temporal correlation parameter $\vartheta$. We choose $T = 5$ as BCBS
(2004a) requires the credit institutions to base their PD estimates on a time series
with minimum length 5 years. For $\rho$, we choose $\rho = 0.12$ as in Sect. 5.4, i.e. again a
value suggested by BCBS (2004a). Our feeling is that default events with a 5 years
time distance can be regarded as being nearly independent. Statistically, this
statement might be interpreted as something like “the correlation of $S_1$ and $S_5$ is
less than 1%”. Setting $\vartheta = 0.3$, we obtain $\mathrm{corr}[S_1, S_T] = \vartheta^4 = 0.81\%$. Thus, the
choice $\vartheta = 0.3$ seems reasonable. Note that our choices of the parameters are
purely exemplary, as to some extent choosing the values of the parameters is rather
a matter of taste or judgement or of decisions depending on the available data or the
purpose of the estimations.¹⁴
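
The correlation matrix (5.25) with entries (5.26) for these parameter choices can be written down directly. The helper name below is hypothetical and serves only to illustrate the structure.

```python
def factor_correlation_matrix(T, theta):
    """Correlation matrix (5.25) with entries r_{s,t} = theta**|s - t| as in (5.26)."""
    return [[theta ** abs(s - t) for t in range(T)] for s in range(T)]

r = factor_correlation_matrix(T=5, theta=0.3)
for row in r:
    print("  ".join(f"{entry:.4f}" for entry in row))

# Correlation between the first and the last year's systematic factor.
corr_first_last = r[0][4]
```

The entry in the top right corner is $0.3^4 = 0.0081$, i.e. the 0.81% quoted above.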

Table 5.13 shows the results of the calculations for the case where no defaults were
observed during 5 years in the whole portfolio. The results for all three grades are
summarized in one table. To arrive at these results, (5.27) was first evaluated with
$N = n_A + n_B + n_C$, then with $N = n_B + n_C$, and finally with $N = n_C$. In all three
cases we set $k = 0$ in (5.30) in order to express that no defaults were observed. Not
surprisingly, the calculated confidence bounds are much lower than those presented
in Table 5.7, thus demonstrating the potentially dramatic effect of exploiting longer
observation periods.

For Table 5.14 we conducted essentially the same computations as for
Table 5.13, the difference being that we assumed that over 5 years $k_A = 0$ defaults
were observed in grade A, $k_B = 2$ defaults were observed in grade B, and $k_C = 1$
default was observed in grade C (as in Sects. 5.3 and 5.4 during 1 year).
Consequently, we set $k = 3$ in (5.30) for calculating the upper confidence bounds
for $p_A$ and $p_B$, as well as $k = 1$ for the upper confidence bounds of $p_C$. Compared
with the results presented in Table 5.8, we observe again the very strong effect of
taking into account a longer time series.


Table 5.13 Upper confidence bounds $\hat{p}_A$ of $p_A$, $\hat{p}_B$ of $p_B$ and $\hat{p}_C$ of $p_C$ as a function of the
confidence level $\gamma$. No defaults during 5 years observed, frequencies of obligors in grades given in
(5.4). Cross-sectionally and inter-temporally correlated default events

$\gamma$       50%     75%     90%     95%     99%     99.9%
$\hat{p}_A$    0.03%   0.06%   0.11%   0.16%   0.30%   0.55%
$\hat{p}_B$    0.03%   0.07%   0.13%   0.18%   0.33%   0.62%
$\hat{p}_C$    0.07%   0.14%   0.26%   0.37%   0.67%   1.23%


Table 5.14 Upper confidence bounds $\hat{p}_A$ of $p_A$, $\hat{p}_B$ of $p_B$ and $\hat{p}_C$ of $p_C$ as a function of the
confidence level $\gamma$. During 5 years, no default observed in grade A, two defaults observed in grade
B, one default observed in grade C, frequencies of obligors in grades given in (5.4).
Cross-sectionally and inter-temporally correlated default events

$\gamma$       50%     75%     90%     95%     99%     99.9%
$\hat{p}_A$    0.12%   0.21%   0.33%   0.43%   0.70%   1.17%
$\hat{p}_B$    0.14%   0.24%   0.38%   0.49%   0.77%   1.29%
$\hat{p}_C$    0.15%   0.27%   0.46%   0.61%   1.01%   1.70%


14 Benjamin et al. (2006) propose a similar methodology that pools multi-year data into one large
pool of customers. Effectively, they implicitly assume identical cross-borrower and inter-temporal
correlations and disregard borrower duplication within the observation period.

The methodology described above could be christened the “cohort approach”, as
cohorts of borrowers are observed over multiple years. It does not take into account
any changes in portfolio size due to new lending or repayment of loans. Moreover,
the approach ignores the information provided by time clusters of defaults (if there
are any). Intuitively, time-clustering of defaults should be the kind of information
needed to estimate the cross-sectional and time-related correlation parameters $\rho$
and $\vartheta$ respectively.¹⁵

A slightly different multi-period approach (called “multiple binomial” in the 
following) allows for variation of portfolio size by new lending and amortization 
and makes it possible, in principle, to estimate the correlation parameters. In 
particular this approach ignores the fact that most of the time the portfolio compo- 
sition this year and next year is almost identical. However, it will turn out that as a 
consequence of the conditional independence assumptions we have adopted the 
impact of ignoring the almost constant portfolio composition is reasonably weak. 

Assume that the portfolio size in year t was $N_t$ for $t = 1, \ldots, T$, and that $d_t$
defaults were observed in year t. Given realisations $S_1, \ldots, S_T$ of the systematic
factors, we then assume that the distribution of the number of defaults in year t
conditional on $S_1, \ldots, S_T$ is binomial as in (5.17) and (5.18), i.e.

$$P[d_t \text{ defaults in year } t \mid S_1, \ldots, S_T] = \binom{N_t}{d_t}\, G(p, \rho, S_t)^{d_t}\, \bigl(1 - G(p, \rho, S_t)\bigr)^{N_t - d_t}. \qquad (5.32)$$

Under the additional assumption of conditional independence of default events
at different moments in time conditional on a realisation of the systematic factors,
(5.32) implies that the unconditional probability to observe $d_1$ defaults in year
1, ..., $d_T$ defaults in year T is given by

$$\begin{aligned}
P[d_t \text{ defaults in year } t,\; t = 1, \ldots, T]
&= E\bigl[P[d_t \text{ defaults in year } t,\; t = 1, \ldots, T \mid S_1, \ldots, S_T]\bigr] \\
&= E\left[\prod_{t=1}^{T} \binom{N_t}{d_t}\, G(p, \rho, S_t)^{d_t}\, \bigl(1 - G(p, \rho, S_t)\bigr)^{N_t - d_t}\right].
\end{aligned} \qquad (5.33)$$


15 Indeed, it is possible to modify the cohort approach in such a way as to take account of portfolio
size varying due to other causes than default and of time-clusters of default. This modification,
however, comes at a high price because it requires a much more complicated input data structure
that causes much longer calculation time.

As (5.33) involves a binomial distribution for each point in time t, we call the
approach the “multiple binomial” approach. If we assume that the latent systematic
factors follow a T-dimensional normal distribution with standard normal marginals
as specified by (5.25) and (5.26), then calculation of the right-hand side of (5.33)
involves the evaluation of a T-dimensional integral. This can be done by Monte-Carlo
simulation as in the case of (5.31).

By means of an appropriate optimisation method¹⁶, the right-hand side of (5.33)
can be used as the likelihood function for the determination of joint maximum
likelihood estimates of the correlation parameters $\rho$ and $\vartheta$ and of the long-run PD
parameter p. This, however, requires at least one of the annual default number
observations $d_t$ to be positive. Otherwise the likelihood (5.33) is constant and equal
to 100% for $p = 0$, and it is not possible to identify unique parameters $\rho$ and $\vartheta$ that
maximise the likelihood. In the context of Table 5.14, if we assume that the three defaults
occurred in the first year and consider the entire portfolio, the maximum likelihood
estimates of $\rho$, $\vartheta$ and p are 34.3%, 0%, and 7.5 bps respectively.
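
A Monte-Carlo evaluation of the likelihood (5.33) might be sketched as follows. This is an illustration only: the function name, the AR(1) simulation of the factors (our choice for reproducing (5.26)) and the constant portfolio size of 800 obligors per year are assumptions, and the optimisation step itself is omitted.

```python
import random
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def likelihood(p, rho, theta, sizes, defaults, n_sims=4000, seed=1):
    """Monte-Carlo estimate of (5.33) for yearly portfolio sizes N_t and default counts d_t."""
    rng = random.Random(seed)
    c = STD_NORMAL.inv_cdf(p)  # default threshold (5.24)
    total = 0.0
    for _ in range(n_sims):
        s_t = rng.gauss(0.0, 1.0)
        product = 1.0
        for t, (n_t, d_t) in enumerate(zip(sizes, defaults)):
            if t > 0:
                # AR(1) step: corr[S_s, S_t] = theta**|s - t|, standard normal marginals
                s_t = theta * s_t + sqrt(1.0 - theta**2) * rng.gauss(0.0, 1.0)
            q = STD_NORMAL.cdf((c - sqrt(rho) * s_t) / sqrt(1.0 - rho))
            product *= comb(n_t, d_t) * q**d_t * (1.0 - q) ** (n_t - d_t)
        total += product
    return total / n_sims

# Assumed data: 800 obligors in each of 5 years, three defaults in the first year.
sizes = [800] * 5
defaults = [3, 0, 0, 0, 0]
lik_mle = likelihood(p=0.00075, rho=0.343, theta=0.0, sizes=sizes, defaults=defaults)
print(lik_mle)
```

Fixing the seed means that the same random numbers are used for every parameter set, which keeps the simulated likelihood smooth enough to be passed to a numerical optimiser. A function of this kind would play the role of the objective in such an optimisation.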

In the case where values of the correlation parameters are known or assumed to
be known, it is also possible to use the multiple binomial approach to compute
confidence bound type estimates of the long-run grade-wise PD estimates as was
done for Table 5.14. To be able to do this calculation with the multiple binomial
approach, we need to calculate the unconditional probability that the total number
of defaults in years 1 to T does not exceed $d = d_1 + \cdots + d_T$. As the sum of
binomially distributed random variables with different success probabilities in
general is not binomially distributed, we calculate an approximate value of the
required unconditional probability based on Poisson approximation:

$$P[\text{No more than } d \text{ defaults in years 1 to } T] \approx E\left[\exp\bigl(-I_{\rho,p}(S_1, \ldots, S_T)\bigr) \sum_{k=0}^{d} \frac{I_{\rho,p}(S_1, \ldots, S_T)^{k}}{k!}\right], \qquad (5.34)$$

$$I_{\rho,p}(S_1, \ldots, S_T) = \sum_{t=1}^{T} N_t\, G(p, \rho, S_t).$$

The expected value in (5.34) again has to be calculated by Monte-Carlo simula- 
tion. Table 5.15 shows the results of such a calculation in the context of Table 5.13 
[i.e. Table 5.13 is recalculated based on (5.34) instead of (5.31)]. 

Similarly, Table 5.16 displays the recalculated Table 5.14 [i.e. Table 5.14 is
recalculated based on (5.34) instead of (5.31)]. The results in Tables 5.15 and 5.16
hardly differ from those in Tables 5.13 and 5.14, respectively. Hence the use of (5.34)
instead of (5.31) in order to allow for different portfolio sizes due to new lending and
amortisation appears to be justified.
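
The Poisson approximation (5.34) can be evaluated by simulation along the following lines. The sketch is not the authors' code: the function name, the AR(1) simulation of the systematic factors and the constant portfolio size of 800 obligors per year are assumptions made for illustration.

```python
import random
from math import exp, factorial, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def prob_at_most_d_poisson(p, rho, theta, sizes, d, n_sims=4000, seed=1):
    """Monte-Carlo estimate of the right-hand side of (5.34)."""
    rng = random.Random(seed)
    c = STD_NORMAL.inv_cdf(p)  # default threshold (5.24)
    total = 0.0
    for _ in range(n_sims):
        s_t = rng.gauss(0.0, 1.0)
        intensity = 0.0
        for t, n_t in enumerate(sizes):
            if t > 0:
                # AR(1) step reproducing the correlation structure (5.26)
                s_t = theta * s_t + sqrt(1.0 - theta**2) * rng.gauss(0.0, 1.0)
            intensity += n_t * STD_NORMAL.cdf((c - sqrt(rho) * s_t) / sqrt(1.0 - rho))
        # Poisson probability of at most d events with the simulated intensity
        total += exp(-intensity) * sum(intensity**k / factorial(k) for k in range(d + 1))
    return total / n_sims

# Assumed portfolio of 800 obligors in each of 5 years, no defaults (d = 0).
sizes = [800] * 5
prob_no_default = prob_at_most_d_poisson(p=0.0003, rho=0.12, theta=0.3, sizes=sizes, d=0)
print(round(prob_no_default, 3))
```

An upper confidence bound at level $\gamma$ would then be obtained by searching for the largest p for which this probability is still at least $1 - \gamma$, for instance by bisection with a fixed seed.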


16 For the numerical examples in this paper, the authors made use of the R-procedure nlminb.

Table 5.15 Upper confidence bounds $\hat{p}_A$ of $p_A$, $\hat{p}_B$ of $p_B$ and $\hat{p}_C$ of $p_C$ as a function of the
confidence level $\gamma$. No defaults during 5 years observed, frequencies of obligors in grades given in
(5.4). Cross-sectionally and inter-temporally correlated default events. Calculation based on (5.34)

$\gamma$       50%     75%     90%     95%     99%     99.9%
$\hat{p}_A$    0.02%   0.05%   0.10%   0.15%   0.29%   0.53%
$\hat{p}_B$    0.03%   0.06%   0.12%   0.17%   0.32%   0.60%
$\hat{p}_C$    0.06%   0.13%   0.26%   0.36%   0.66%   1.19%


Table 5.16 Upper confidence bounds $\hat{p}_A$ of $p_A$, $\hat{p}_B$ of $p_B$ and $\hat{p}_C$ of $p_C$ as a function of the
confidence level $\gamma$. During 5 years, no default observed in grade A, two defaults observed in grade
B, one default observed in grade C, frequencies of obligors in grades given in (5.4).
Cross-sectionally and inter-temporally correlated default events. Calculation based on (5.34)

$\gamma$       50%     75%     90%     95%     99%     99.9%
$\hat{p}_A$    0.12%   0.21%   0.33%   0.42%   0.68%   1.12%
$\hat{p}_B$    0.13%   0.23%   0.37%   0.47%   0.76%   1.24%
$\hat{p}_C$    0.14%   0.26%   0.44%   0.59%   0.99%   1.66%


5.7 Applications 


The most prudent estimation methodology described in the previous sections can be 
used for a range of applications, both within a bank and in a Basel II context. In the 
latter case, it might be specifically useful for portfolios where neither internal nor 
external default data are sufficient to meet the Basel requirements. A good example 
might be Specialized Lending. In these high-volume, low-number and low-default 
portfolios, internal data is often insufficient for PD estimations per rating category, 
and might indeed even be insufficient for central tendency estimations for the entire 
portfolio (across all rating grades). Moreover, mapping to external ratings — 
although explicitly allowed in the Basel context and widely used in bank internal 
applications — might be impossible due to the low number of externally rated 
exposures. 

The (conservative) principle of the most prudent estimation could serve as an 
alternative to the Basel slotting approach, subject to supervisory approval. In this 
context, the proposed methodology might be interpreted as a specific form of the 
Basel requirement of conservative estimations if data is scarce. 

In a wider context, within the bank, the methodology might be used for all sorts 
of low default portfolios. In particular, it could complement other estimation 
methods, whether this be mapping to external ratings, the proposals by Schuermann 
and Hanson (2004) or others. As such, we see our proposed methodology as an 
additional source for PD calibrations. This should neither invalidate nor prejudge a 
bank’s internal choice of calibration methodologies. 

However, we tend to believe that our proposed methodology should only be 
applied to whole rating systems and portfolios. One might think of calibrating PDs 
of individual low default rating grades within an otherwise rich data structure. 
Doing so almost unavoidably leads to a structural break between average PDs
(data rich rating grades) and upper PD bounds (low default rating grades) which 
makes the procedure appear infeasible. Similarly, we believe that the application 
of the methodology for backtesting or similar validation tools would not add 
much additional information. For instance, purely expert-based average PDs per 
rating grade would normally be well below our proposed quantitative upper 
bounds. 


5.8 Open Issues 


For applications, a number of important issues need to be addressed: 


• Which confidence levels are appropriate? The proposed most prudent estimate
could serve as a conservative proxy for average PDs. In determining the confidence
level, the impact of a potential underestimation of these average PDs
should be taken into account. One might think that the transformation of average
PDs into some kind of “stress” PDs, as done in the Basel II and many other credit
risk models, could justify rather low confidence levels for the PD estimation in
the first place (i.e. using the models as providers of additional buffers against
uncertainty). However, this conclusion would be misleading, as it mixes two
different types of “stresses”: the Basel II model “stress” of the single systematic
factor over time, and the estimation uncertainty “stress” of the PD estimations.
Indeed, we would argue for moderate confidence levels when applying the most
prudent estimation principle, but for other reasons. The most common alternative
to our methodology, namely deriving PDs from averages of historical
default rates per rating grade, yields a comparable probability that the true PD
will be underestimated. Therefore, high confidence levels in our methodology
would be hard to justify.

• At which number of defaults should users deviate from our methodology and use
“normal” average PD estimation methods, at least for the overall portfolio
central tendency? Can this critical number be analytically determined?

• If the relative number of defaults in one of the better rating grades is significantly
higher than those in lower rating grades (and within low default portfolios,
this might happen with only one or two additional defaults), then our PD
estimates may turn out to be non-monotonic. In which cases should this be
taken as an indication of an incorrect ordinal ranking? Certainly, monotonicity or
non-monotonicity of our upper PD bounds does not immediately imply that the
average PDs are monotonic or non-monotonic. Under which conditions would
there be statistical evidence of a violation of the monotonicity requirement for
the PDs?


Currently, we do not have definite solutions to the above issues. We believe, though,
that some of them will involve a certain amount of expert judgment rather than
analytical solutions. In particular, that might be the case with the first item. If our
proposed approach were used in a supervisory context, say Basel II, supervisors
might want to think about suitable confidence levels that should be consistently
applied.


5.9 Estimation Versus Validation 


We have been somewhat surprised to see the methodology described in this chapter
often being applied for PD validation rather than PD estimation. This new section
for the second edition of the book sets out principles as to when and when not to apply
the methodology for PD estimation, as well as examples where application might be
useful in practice.

First, the low default estimation methodology based on upper confidence bounds 
has a high degree of inbuilt conservatism. Comparing default rates or PDs estimated 
by other methodologies against confidence-bound-based PDs must take this esti- 
mation bias into account — having observed default rates not breaching our upper 
confidence bounds should not be regarded as a particular achievement, and observ- 
ing default rates above the confidence bounds may indicate a serious PD under- 
estimation indeed. 

Second, spreading the central tendency of a portfolio across rating grades via the 
most prudent estimation principle has the grade PDs, in effect, solely driven by 
grade population and the confidence level. There are limits as to how wide the 
central tendency can be statistically spread, implying that the slope of the most 
prudent PDs over rating grades tends to be much flatter than PD curves derived by
alternative methods (e.g. benchmarking to external ratings).

So which benefits can be derived from validation via benchmarking against low 
default estimates based on upper confidence bounds? As the low default methodol- 
ogy delivers conservative PD estimates, it can offer some insight into the degree of 
conservatism for PDs calibrated by another method. 

For a given PD estimate (derived, for example, by benchmarking to external 
ratings) and an observed number of defaults, an intermediate step of the calculation 
of upper confidence bounds gives an implied confidence level that would have 
delivered the same PD from the default rate via the confidence bound calculation. 
Indeed, using (5.27) with the given PD estimate to determine γ generates an implied
confidence level as desired.
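The idea of an implied confidence level can be illustrated with a minimal sketch. As a simplifying assumption, it uses the single-period, uncorrelated binomial setting, where the upper confidence bound at level γ solves 1 − γ = P[X ≤ k], rather than (5.27). The function name and arguments are hypothetical.

```python
from math import comb


def implied_confidence_level(pd_estimate, n_obligors, n_defaults):
    """Confidence level gamma at which the binomial upper confidence bound
    would coincide with pd_estimate (uncorrelated, single-period assumption)."""
    # P[X <= k] for X ~ Binomial(n_obligors, pd_estimate)
    tail = sum(
        comb(n_obligors, i) * pd_estimate**i * (1.0 - pd_estimate) ** (n_obligors - i)
        for i in range(n_defaults + 1)
    )
    return 1.0 - tail


# Hypothetical example: a PD of 0.5% for 800 obligors with 2 observed defaults
gamma = implied_confidence_level(0.005, 800, 2)
```

A higher PD estimate for the same default count yields a higher implied confidence level, which is what makes the figure usable as a measure of conservatism.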

While there is no test as to which confidence level is “too conservative” in this 
context, the approach offers an opportunity for the quantification of conservatism 
that might be helpful in bank internal and regulatory discussions. The approach is 
most useful for central tendency comparisons — application at grade level may 
result in very different confidence levels across the rating scale due to the low 
number of defaults. The interpretation of such fluctuating levels then becomes 
somewhat of a challenge. The approach might yield useful results over time, 
however, as the implicit confidence level changes. The volatility can give some 
qualitative indication as to how much “point in time” or “through the cycle” a rating 
system is: the latter should result in higher volatility as observed default rates are
always point in time. 


5.10 Conclusions 


In this article, we have introduced a methodology for estimating probabilities of 
default in low or no default portfolios. The methodology is based on upper 
confidence intervals by use of the most prudent estimation. Our methodology 
uses all available quantitative information. In the extreme case of no defaults in 
the entire portfolio, this information consists solely of the absolute numbers of 
counter-parties per rating grade. 

The lack of defaults in the entire portfolio prevents reliable quantitative state- 
ments on both the absolute level of average PDs per rating grade as well as on the 
relative risk increase from rating grade to rating grade. Within the most prudent 
estimation methodology, we do not use such information. The only additional 
assumption used is the ordinal ranking of the borrowers, which is assumed to be 
correct. 

Our PD estimates might seem rather high at first sight. However, given the 
amount of information that is actually available, the results do not appear out of 
range. We believe that the choice of moderate confidence levels is appropriate 
within most applications. The results can be scaled to any appropriate central 
tendency. Additionally, the multi-year context as described in Sect. 5.6 might 
provide further insight. 


Appendix A 


This appendix provides additional information on the analytical and numerical 
solutions of (5.10) and (5.14). 

Analytical solution of (5.10). If X is a binomially distributed random variable
with size parameter n and success probability p, then for any integer 0 ≤ k < n, we
have

$$\sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i} = P[X \le k] = 1 - P[Y \le p] \tag{5.35}$$

with Y denoting a beta distributed random variable with parameters α = k + 1 and
β = n − k (see, e.g., Hinderer (1980), Lemma 11.2). The beta distribution function
and its inverse function are available in standard numerical tools, e.g. in Excel.

Direct numerical solution of Equation (5.10). The following proposition shows 
the existence and uniqueness of the solution of (5.10), and, at the same time, 
provides initial values for the numerical root-finding [see (5.38)]. 
Proposition A.1. Let 0 ≤ k < n be integers, and define the function $f_{n,k}$:
(0, 1) → R by

$$f_{n,k}(p) = \sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i}, \quad p \in (0, 1). \tag{5.36}$$

Fix some 0 < v < 1. Then the equation

$$f_{n,k}(p) = v \tag{5.37}$$

has exactly one solution 0 < p = p(v) < 1. Moreover, this solution p(v) satisfies
the inequalities

$$1 - \sqrt[n]{v} \le p(v) \le \sqrt[n]{1 - v}. \tag{5.38}$$

Proof. A straightforward calculation yields

$$\frac{d f_{n,k}(p)}{dp} = -(n-k) \binom{n}{k} p^k (1-p)^{n-k-1}. \tag{5.39}$$

Hence $f_{n,k}$ is strictly decreasing. This implies uniqueness of the solution of
(5.37). The inequalities

$$f_{n,0}(p) \le f_{n,k}(p) \le f_{n,n-1}(p) \tag{5.40}$$

imply the existence of a solution of (5.37) and the inequalities (5.38).
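A minimal sketch of the numerical root-finding: because the binomial distribution function is strictly decreasing in p, bisection started from the two bounds of (5.38) is guaranteed to converge. Bisection is an assumption of this sketch (any bracketing root-finder would do), and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P[X <= k] for X ~ Binomial(n, p); this is f_{n,k}(p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def upper_confidence_bound(n, k, gamma, tol=1e-12):
    """Solve f_{n,k}(p) = 1 - gamma by bisection on the bracket (5.38)."""
    v = 1.0 - gamma
    lo = 1.0 - v ** (1.0 / n)    # lower bound: exact solution for k = 0
    hi = (1.0 - v) ** (1.0 / n)  # upper bound: exact solution for k = n - 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, n, mid) > v:
            lo = mid  # f is decreasing, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: 800 obligors, 2 defaults, 90% confidence
p_upper = upper_confidence_bound(800, 2, 0.90)
```

For k = 0 the lower bracket is already the exact solution, so the iteration simply collapses onto it.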

Numerical solution of (5.14). For (5.14) we can derive a result similar to 
Proposition A.1. However, there is no obvious upper bound to the solution p(v) of 
(5.42) as in (5.38). 

Proposition A.2. For any probability 0 < p < 1, any correlation 0 < ρ < 1
and any real number y define

$$F_\rho(p, y) = \Phi\left(\frac{\Phi^{-1}(p) + \sqrt{\rho}\, y}{\sqrt{1 - \rho}}\right), \tag{5.41}$$

where we make use of the same notations as for (5.14). Fix a value 0 < v < 1 and a
positive integer n. Then the equation

$$v = \int_{-\infty}^{\infty} \varphi(y) \left(1 - F_\rho(p, y)\right)^n dy, \tag{5.42}$$

with φ denoting the standard normal density, has exactly one solution 0 < p = p(v) < 1.
This solution p(v) satisfies the inequality

$$p(v) \ge 1 - \sqrt[n]{v}. \tag{5.43}$$
Proof of Proposition A.2. Note that, for fixed ρ and y, the function $F_\rho(p, y)$ is
strictly increasing and continuous in p. Moreover, we have

$$0 = \lim_{p \to 0} F_\rho(p, y) \quad \text{and} \quad 1 = \lim_{p \to 1} F_\rho(p, y). \tag{5.44}$$

Equation (5.44) implies existence and uniqueness of the solution of (5.42).
Define the random variable Z by

$$Z = F_\rho(p, Y), \tag{5.45}$$

where Y denotes a standard normally distributed random variable. Then Z has the
well-known Vasicek distribution (cf. Vasicek 1997), and in particular we have

$$E[Z] = p. \tag{5.46}$$

Using (5.45), (5.42) can be rewritten as

$$v = E\left[(1 - Z)^n\right]. \tag{5.47}$$

Since $y \mapsto (1 - y)^n$ is convex for 0 < y < 1, by (5.46) Jensen's inequality
implies

$$v = E\left[(1 - Z)^n\right] \ge (1 - p)^n. \tag{5.48}$$

As the right-hand side of (5.42) is decreasing in p, (5.43) now follows from
(5.48).
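The numerical solution of (5.42) can be sketched as follows. The sketch assumes the one-factor form $F_\rho(p, y) = \Phi((\Phi^{-1}(p) + \sqrt{\rho}\,y)/\sqrt{1-\rho})$, integrates with a plain trapezoidal rule on a truncated interval, and uses bisection started at the lower bound (5.43). As no analytical upper bound is available, the upper bracket is simply set close to one. All names, the grid and the truncation at ±8 are assumptions of this illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def no_default_probability(p, rho, n, y_max=8.0, steps=2000):
    """Right-hand side of (5.42), trapezoidal rule on [-y_max, y_max]."""
    threshold = STD_NORMAL.inv_cdf(p)
    h = 2.0 * y_max / steps
    total = 0.0
    for j in range(steps + 1):
        y = -y_max + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        conditional_pd = STD_NORMAL.cdf((threshold + sqrt(rho) * y) / sqrt(1.0 - rho))
        total += weight * STD_NORMAL.pdf(y) * (1.0 - conditional_pd) ** n
    return total * h


def solve_pd(v, rho, n, iterations=60):
    """Bisection for the unique p(v) solving (5.42)."""
    lo = 1.0 - v ** (1.0 / n)  # lower bound (5.43)
    hi = 1.0 - 1e-12           # ad hoc upper bracket
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if no_default_probability(mid, rho, n) > v:
            lo = mid  # right-hand side is decreasing in p
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: 800 obligors, correlation 12%, confidence level 90%
p_correlated = solve_pd(0.10, 0.12, 800)
```

The result lies above the uncorrelated bound 1 − v^(1/n), in line with the Jensen argument of the proof.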


Appendix B 


This appendix provides additional numerical results for the “scaling” extension of
the most prudent estimation principle according to Sect. 5.5 in the case of no default
portfolios. In the examples presented in Tables 5.17 and 5.18, the confidence level
for deriving the upper confidence bound for the overall portfolio PD, and the
confidence levels for the most prudent estimates of PDs per rating grade have
always been set equal. Moreover, our methodology always provides equality
between the upper bound of the overall portfolio PD and the most prudent estimate
for $p_A$ according to the respective examples of Sects. 5.2 and 5.4.


Table 5.17 Upper confidence bound $\hat{p}_{A,scaled}$ of $p_A$, $\hat{p}_{B,scaled}$ of $p_B$ and $\hat{p}_{C,scaled}$ of $p_C$ as a function of
the confidence level γ after scaling to the upper confidence bound of the overall portfolio PD. No
default observed, frequencies of obligors in grades given in (5.4). Uncorrelated default events

γ                   50%     75%     90%     95%     99%     99.9%
Central tendency    0.09%   0.17%   0.29%   0.37%   0.57%   0.86%
K                   0.61    0.66    0.60    0.58    0.59    0.59
$\hat{p}_A$         0.05%   0.11%   0.17%   0.22%   0.33%   0.51%
$\hat{p}_B$         0.06%   0.13%   0.20%   0.25%   0.39%   0.58%
$\hat{p}_C$         0.14%   0.24%   0.45%   0.58%   0.89%   1.35%


Table 5.18 Upper confidence bound $\hat{p}_{A,scaled}$ of $p_A$, $\hat{p}_{B,scaled}$ of $p_B$ and $\hat{p}_{C,scaled}$ of $p_C$ as a function of
the confidence level γ after scaling to the upper confidence bound of the overall portfolio PD. No
default observed, frequencies of obligors in grades given in (5.4). Correlated default events

γ                   50%     75%     90%     95%     99%     99.9%
Central tendency    0.15%   0.40%   0.86%   1.31%   2.65%   5.29%
K                   0.62    0.65    0.66    0.68    0.70    0.73
$\hat{p}_A$         0.09%   0.26%   0.57%   0.89%   1.86%   3.87%
$\hat{p}_B$         0.11%   0.29%   0.64%   0.98%   2.05%   4.22%
$\hat{p}_C$         0.23%   0.59%   1.25%   1.89%   3.72%   7.19%
Chapter 6
Transition Matrices: Properties and Estimation Methods


Bernd Engelmann and Konstantin Ermakov


6.1 Introduction 


In Chaps. 1-3 estimation methods for 1-year default probabilities have been
presented. In many risk management applications a 1-year default probability 
is not sufficient because multi-year default probabilities or default probabilities 
corresponding to year fractions are needed. Practical examples in the context of 
retail loan pricing and risk management are presented in Chaps. 17 and 18. In other 
applications, like credit risk modelling, rating transitions, i.e. the probability that 
a debtor in rating grade i moves to rating grade j within a period of time, are of 
importance. In all cases, a 1-year transition matrix serves as the starting point. 

In this chapter, we will assume a rating system with n rating grades where the 
n-th grade is the default grade. A 1-year transition matrix is an n × n matrix with the
probabilities that a debtor in rating grade i migrates to rating grade j within 1 year. 
We start with exploring the properties of transition matrices. Under the assumption 
that rating transitions are Markovian, i.e. that rating transitions have “no memory”, 
and that transition probabilities are time-homogeneous it is possible to compute 
transition matrices for arbitrary time periods. We will show the formulas for this 
calculation in detail. 

These concepts will be illustrated with a numerical example where a 6-month 
transition matrix is computed. We will see from this example that a straightforward 
application of the formulas for computing transition matrices for arbitrary time 
frames can lead to implausible results. We will also see that this is the case for most 
practical examples. To make the calculation of arbitrary transition matrices work in 
practice, a regularization algorithm has to be applied to the original 1-year transition 
matrix. A number of regularization algorithms exist in the literature. We will present
one of them that is easy to implement and delivers reasonable results. 

After that, two different estimation methods for transition matrices are pre- 
sented, the cohort method and the duration method. While the cohort method 
directly estimates a 1-year transition matrix, the duration method estimates the
generator of the transition matrix, i.e. its matrix logarithm. While in the literature 
it is occasionally claimed that the duration method offers an advantage over the 
cohort method (Jafry and Schuermann 2004), we will show using a simple simula- 
tion study that this is not the case. 


6.2 Properties of Transition Matrices 


A 1-year transition matrix P is a matrix of the form 


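$$P = \begin{pmatrix} p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n-1,1} & p_{n-1,2} & \cdots & p_{n-1,n} \\ 0 & 0 & \cdots & 1 \end{pmatrix} \tag{6.1}$$


where $p_{i,j}$ is the probability that a debtor migrates from rating grade i to grade j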
within 1 year. The final grade n is the default state which is absorbing, i.e. once a 
debtor has defaulted he cannot migrate back to an alive state but will stay in the 
default state forever. 

A transition matrix P is characterized by the four properties: 


• All entries are probabilities, i.e. $0 \le p_{i,j} \le 1$, i, j = 1, ..., n.
• The sum of the entries of each row is one, $\sum_{j=1}^{n} p_{i,j} = 1$.
• The rightmost entry of each row, $p_{i,n}$, is the default probability of rating grade i.
• The default grade is absorbing, $p_{n,j} = 0$ for j < n, and $p_{n,n} = 1$.


The second property can also be interpreted intuitively. If a debtor is in rating 
grade i at the beginning of a period he must be either still in rating grade i, or in 
some other rating grade, or in default at the end of the period. Therefore, all row 
probabilities have to sum to one. 
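The checkable properties can be verified mechanically. The following is a minimal sketch with a hypothetical helper name and a numerical tolerance chosen for illustration; the third property is an interpretation of the last column and needs no check.

```python
def is_transition_matrix(P, tol=1e-9):
    """Check entries in [0, 1], unit row sums and an absorbing last grade."""
    n = len(P)
    entries_ok = all(
        len(row) == n and all(-tol <= x <= 1.0 + tol for x in row) for row in P
    )
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in P)
    absorbing = all(abs(x) <= tol for x in P[-1][:-1]) and abs(P[-1][-1] - 1.0) <= tol
    return entries_ok and rows_ok and absorbing


# Two-grade illustration: one alive grade and the default grade
valid = is_transition_matrix([[0.95, 0.05], [0.0, 1.0]])
```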

In practice it can happen that a debtor disappears from the data sample because it 
is no longer rated. This is not considered in a modelling approach. Typically these 
cases are excluded from the data sample or an additional rating grade “Non-rated” 
is introduced to measure the proportion of annual “transitions” into this class. 
However, when the transition matrix is used in a practical application the “Non- 
rated” grade has to be removed and the transition probabilities have to be rescaled 
to sum to one. 
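Removing the “Non-rated” category and rescaling can be sketched as below. The column position of the “Non-rated” category and the function name are hypothetical, and the sketch assumes every row keeps some probability mass outside that category.

```python
def drop_non_rated(rows, nr_index):
    """Remove the 'Non-rated' column and rescale each row to sum to one."""
    result = []
    for row in rows:
        kept = [x for j, x in enumerate(row) if j != nr_index]
        total = sum(kept)  # assumed to be positive
        result.append([x / total for x in kept])
    return result


# Hypothetical row: stay, downgrade, default, non-rated
rescaled = drop_non_rated([[0.90, 0.04, 0.01, 0.05]], nr_index=3)
```

Each remaining probability is divided by the probability of not becoming non-rated, so relative proportions within a row are preserved.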

Typically transition matrices refer to a time period of 1 year. In several risk 
management applications multi-year default probabilities are needed. If we assume 
that the process of rating transitions is stationary and Markovian, it is possible to
compute multi-year transition matrices. The first property means that the probabil- 
ity for a migration from i to j depends on the length of the observation period only, 
not on its starting point in time. A transition matrix describing rating migrations 
from 1/1/2010 to 1/1/2011 and a matrix corresponding to the time period from 1/1/2012 
to 1/1/2013 are identical. Rating transitions are called Markovian if the migration 
probabilities depend only on the current rating of a debtor, but not on the rating path 
a debtor has passed through during the past years. Both properties of rating 
processes are questionable from an empirical point of view but lead to very 
convenient mathematical structures. This will be illustrated with a simple example. 
Consider a rating system with two rating grades and a 1-year transition matrix 


$$P = \begin{pmatrix} 0.95 & 0.05 \\ 0.00 & 1.00 \end{pmatrix}$$

We compute the 2-year transition matrix. If a debtor survives year one the
transition matrix for year two is again equal to P because of the stationarity
property. The possible rating paths are illustrated in Fig. 6.1.


Fig. 6.1 Possible rating paths of a debtor in the simple rating system (in each of Year 1 and
Year 2, survival has probability 0.95 and default has probability 0.05)


A debtor in grade 1 can default after year 1, he can survive year one and default 
in year two, and he can survive both years. The sum of the first two paths leads to 
the 2-year default probability 0.05 + 0.05 x 0.95 = 0.0975, while the last path 
leads to the 2-year survival probability 0.95 x 0.95 = 0.9025. This leads to the 
2-year transition matrix

$$P(2) = \begin{pmatrix} 0.9025 & 0.0975 \\ 0.00 & 1.00 \end{pmatrix}$$


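A closer look at the calculations we have carried out reveals that the 2-year
transition matrix is the result of the multiplication of the 1-year transition matrix
with itself:

$$P(1) \cdot P(1) = \begin{pmatrix} 0.95 & 0.05 \\ 0.00 & 1.00 \end{pmatrix} \begin{pmatrix} 0.95 & 0.05 \\ 0.00 & 1.00 \end{pmatrix} = \begin{pmatrix} 0.95 \cdot 0.95 & 0.95 \cdot 0.05 + 0.05 \cdot 1.00 \\ 0.00 & 1.00 \end{pmatrix} = \begin{pmatrix} 0.9025 & 0.0975 \\ 0.00 & 1.00 \end{pmatrix}$$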


Therefore, arbitrary multi-year transition matrices can be computed by itera- 
tive multiplication of the 1-year transition matrix with itself. Using this, default 
probabilities for m years can be computed from the 1-year transition matrix. They
can be read directly from the last column of the m-year transition matrix. 
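A minimal sketch of the iterated multiplication, using plain nested lists and hypothetical helper names, applied to the two-grade example from the text:

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def multi_year_matrix(P, years):
    """m-year transition matrix as the m-fold product of the 1-year matrix."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(years):
        result = mat_mul(result, P)
    return result


P = [[0.95, 0.05], [0.0, 1.0]]
P2 = multi_year_matrix(P, 2)
# Cumulative 2-year default probabilities are the last column
cumulative_pd = [row[-1] for row in P2]
```

For the first grade this reproduces the 2-year survival probability 0.9025 and the 2-year default probability 0.0975 derived above.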

In some applications more general transition matrices are needed, e.g. a transi- 
tion matrix corresponding to a time period of 3 months. An example is the pricing of 
loans with embedded options which is described in Chap. 18. It is not possible to 
compute transition matrices for arbitrary year fractions with the methods presented 
so far. 

Suppose we would like to compute a 6-months transition matrix. This is equiva- 
lent to computing a square root of the 1-year transition matrix because we know that 
a multiplication of the 6-months transition matrix with itself must result in the 
1-year transition matrix. Therefore, we can write¹

$$P(1) = P(0.5) \cdot P(0.5) = (P(0.5))^2$$
$$P(0.5) = \sqrt{P(1)} = P(1)^{0.5} = \exp\left(\log\left(P(1)^{0.5}\right)\right) = \exp(0.5 \cdot \log(P(1))) \tag{6.2}$$


In principle, (6.2) can be generalized to arbitrary year fractions t. If the logarithm
of the 1-year transition matrix were known, arbitrary transition matrices
could be computed from the exponential. It remains to explain how to compute
a logarithm and an exponential of a matrix. Both functions are defined for an 
arbitrary matrix X by their Taylor series expansions 


$$\exp(X) = I + X + \frac{1}{2!} X^2 + \frac{1}{3!} X^3 + \ldots \tag{6.3}$$

$$\log(X) = X - I - \frac{1}{2}(X - I)^2 + \frac{1}{3}(X - I)^3 - \ldots \tag{6.4}$$


where I is the identity matrix. Both series have to be evaluated until a reasonable
accuracy for the logarithm and the exponential is achieved. 
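The two truncated series can be sketched in a few lines. The helper names and the default of 50 terms are choices of this illustration. Note that the logarithm series only converges when X is sufficiently close to the identity matrix, which holds for the small example used here.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_axpy(A, B, scale):
    """Return A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_exp(X, terms=50):
    """Truncated series (6.3)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, X)]  # X^k / k!
        result = mat_axpy(result, term, 1.0)
    return result


def mat_log(X, terms=50):
    """Truncated series (6.4)."""
    n = len(X)
    delta = mat_axpy(X, identity(n), -1.0)  # X - I
    power = identity(n)
    result = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = mat_mul(power, delta)  # (X - I)^k
        result = mat_axpy(result, power, (1.0 if k % 2 else -1.0) / k)
    return result


def fractional_matrix(P, t, terms=50):
    """P(t) = exp(t * log(P(1))) as in (6.2)."""
    G = mat_log(P, terms)
    return mat_exp([[t * g for g in row] for row in G], terms)


P = [[0.95, 0.05], [0.0, 1.0]]
half_year = fractional_matrix(P, 0.5)
```

Multiplying `half_year` with itself recovers P up to rounding, which is a convenient sanity check for any year fraction of the form 1/m.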

As an example, we compute the 6-months transition matrix of the 1-year matrix 
M given in Fig. 6.2. The matrix is based on Moody’s average 1-year letter rating 
from 1920 to 2007 (Moody’s 2008). In the original document Moody’s (2008), the 
fraction of companies that migrated into the “without rating” state is reported. To 
get the matrix M in Fig. 6.2, this category has to be removed and all probabilities 
have to be rescaled so that each row sums to one. The matrix has nine rating grades
where the ninth grade is the default grade. 

To compute the 6-months transition matrix, the logarithm of M has to be 
computed using (6.4) as a first step. The result is given in Fig. 6.3. For this 
calculation we have used 50 terms in the Taylor expansion (6.4). 

Finally, this matrix has to be multiplied with 0.5 and the exponential of 
the resulting matrix has to be computed using (6.3). This leads to the 6-months 


¹Note that by log(x) we mean the inverse of exp(x), not the logarithm to the base ten.
             1         2         3         4         5         6         7         8         D

1 |   0.911200  0.078020  0.008779  0.001743  0.000251  0.000010  0.000000  0.000000  0.000000
2 |   0.013430  0.907400  0.068850  0.007316  0.001864  0.000394  0.000021  0.000043  0.000671
3 |   0.000859  0.031110  0.902300  0.056180  0.007349  0.001145  0.000202  0.000085  0.000806
4 |   0.000454  0.003170  0.049960  0.877800  0.055250  0.008395  0.001623  0.000173  0.003170
5 |   0.000079  0.000921  0.005346  0.066460  0.827100  0.078360  0.006256  0.000573  0.014870
6 |   0.000080  0.000613  0.001965  0.007155  0.071460  0.811600  0.056910  0.005702  0.044490
7 |   0.000000  0.000317  0.000418  0.002442  0.010240  0.100800  0.709900  0.040120  0.135700
8 |   0.000000  0.000000  0.001338  0.000000  0.005466  0.037360  0.088770  0.637900  0.229100
D |   0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  1.000000

Fig. 6.2 Moody’s average 1-year letter rating migrations from 1920 to 2007


             1         2         3         4         5         6         7         8         D

1 |  -0.093630  0.085740  0.006387  0.001426  0.000132 -0.000022 -0.000002 -0.000003 -0.000033
2 |   0.014750 -0.099110  0.075980  0.005729  0.001653  0.000312 -0.000008  0.000050  0.000642
3 |   0.000680  0.034320 -0.105900  0.062890  0.006411  0.000739  0.000135  0.000095  0.000657
4 |   0.000463  0.002540  0.056010 -0.134600  0.064520  0.006779  0.001563  0.000137  0.002605
5 |   0.000059  0.000838  0.003879  0.077820 -0.196500  0.095460  0.004454  0.000295  0.013730
6 |   0.000083  0.000618  0.001888  0.004980  0.087120 -0.217800  0.074710  0.005705  0.042700
7 |  -0.000009  0.000340  0.000233  0.002322  0.007298  0.130600 -0.351500  0.059400  0.150200
8 |  -0.000002 -0.000071  0.001691 -0.000595  0.004852  0.043130  0.130600 -0.453700  0.274200
D |   0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000

Fig. 6.3 Logarithm of the matrix M of Fig. 6.2
             1         2         3         4         5         6         7         8         D

1 |   0.954400  0.040890  0.003824  0.000789  0.000096 -0.000003 -0.000001 -0.000001 -0.000008
2 |   0.007036  0.952100  0.036150  0.003286  0.000878  0.000176  0.000004  0.000023  0.000328
3 |   0.000387  0.016330  0.949100  0.029710  0.003462  0.000472  0.000084  0.000045  0.000365
4 |   0.000229  0.001438  0.026440  0.935900  0.029830  0.003840  0.000793  0.000078  0.001444
5 |   0.000035  0.000439  0.002336  0.035930  0.907900  0.043200  0.002750  0.000222  0.007154
6 |   0.000040  0.000307  0.000959  0.003091  0.039410  0.898900  0.032560  0.002887  0.021860
7 |  -0.000002  0.000163  0.000166  0.001182  0.004503  0.057570  0.840700  0.024390  0.071340
8 |  -0.000001 -0.000016  0.000749 -0.000139  0.002598  0.020130  0.053770  0.797800  0.125100
D |   0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  1.000000

Fig. 6.4 Six-months transition matrix corresponding to the matrix M of Fig. 6.2


transition matrix given in Fig. 6.4. Again 50 terms of the Taylor expansion (6.3) 
are used. 

We find that the 6-month transition matrix does not fulfil all the necessary 
properties of transition matrices because it contains negative probabilities. The 
transition probabilities between low grades and high grades are very small but 
negative numbers. At first glance, one might suppose that the small negative
numbers are the result of a numerical instability or inaccuracy in the evaluation of 
(6.3) and (6.4). However, this is not the case. There is an economic reason why it is 
impossible to compute a meaningful 6-month transition matrix from the matrix M. 

In Fig. 6.2 we see that the matrix M contains several transition probabilities 
equal to zero. In the data sample no transitions from rating grade 1 to a grade worse 
than 6 have been observed. Similarly, no rating improvement from grade 7 directly 
to grade 1 has been reported. However, it is possible to migrate within 1 year, for
instance, from grade 1 to grade 4. For reasons of consistency, all migration 
probabilities that are zero in the 1-year matrix have to be zero in the 6-months 
matrix. The same is true for the positive probabilities. Under these restrictions, 
however, it is impossible to compute a valid 6-months matrix. In a 6-months matrix 
a transition from grade 1 to grade 4 must have a positive probability and a transition 
from grade 4 to grade 7 must have a positive probability. This implies that the 
1-year transition probability from grade 1 to grade 7 must be positive because 
a debtor can migrate in 6 months from grade 1 to grade 4 and in the following
6 months from grade 4 to grade 7. In the matrix M a migration from grade 1 to grade 
7 has a probability of zero which is a contradiction. 

From this example, we see that whenever a 1-year transition matrix contains zero 
entries there is no valid transition matrix for time periods below 1 year.² From the
theory of Markov chains it is known that transition matrices for arbitrary time 
periods can be computed if the logarithm of the 1-year transition matrix results in 
a generator matrix. 

A matrix $G = (g_{i,j})_{i,j=1,\ldots,n}$ is called a generator matrix if it has the three
properties:


• All diagonal entries are not positive, $g_{i,i} \le 0$, i = 1, ..., n.
• All other entries are not negative, $g_{i,j} \ge 0$, i, j = 1, ..., n and i ≠ j.
• All row sums are zero, $\sum_{j=1}^{n} g_{i,j} = 0$, i = 1, ..., n.
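The three generator properties translate directly into a check. The helper name and the tolerance are assumptions of this minimal sketch.

```python
def is_generator(G, tol=1e-9):
    """Check non-positive diagonal, non-negative off-diagonal, zero row sums."""
    for i, row in enumerate(G):
        if row[i] > tol:
            return False
        if any(g < -tol for j, g in enumerate(row) if j != i):
            return False
        if abs(sum(row)) > tol:
            return False
    return True


# Hypothetical generator of a two-grade system with an absorbing default grade
ok = is_generator([[-0.05, 0.05], [0.0, 0.0]])
```

A logarithm with small negative off-diagonal entries, like the one discussed in the text, fails the second condition.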


From the generator matrix, an arbitrary transition matrix P(t) corresponding to 
a time period t is computed as


P(t) = exp(t-G) 


From Fig. 6.3 we see that the logarithm of the matrix M is not a generator matrix.
Some off-diagonal entries corresponding to rating transitions from very high to very 
low grades and from very low to very high grades are negative. However, the absolute 
value of these negative numbers is small compared to the remaining entries of the 
matrix. 


²This is not true in general because there are cases where it is still possible to compute transition
matrices for arbitrary time periods if the 1-year matrix contains zeros. The simplest example is the
identity matrix. However, basically for all practically relevant cases it is true that no consistent
t-year transition matrix can be computed from the one-year matrix where t is an arbitrary year
fraction.
One idea to solve the problems associated with the non-existence of a generator
matrix is replacing the logarithm of the 1-year transition matrix by a generator 
matrix that is close to the logarithm matrix. Replacing the logarithm of the 1-year 
matrix by a similar matrix which has the properties of a generator matrix, or 
equivalently, replacing the original 1-year transition matrix by a similar transition 
matrix that allows the calculation of a generator matrix, is called regularization. 
Several suggestions for regularization algorithms have been made in the literature. 
Examples are Israel et al. (2001) and Kreinin and Sidelnikova (2001). 

A very simple regularization algorithm is proposed by Kreinin and Sidelnikova 
(2001). It can be summarized by three steps: 


1. Compute the logarithm of M, G = log(M). 
2. Replace all negative non-diagonal entries of G by zero. 
3. Adjust all non-zero elements of G by:

$$g_{i,j} \leftarrow g_{i,j} - |g_{i,j}| \, \frac{\sum_{k=1}^{n} g_{i,k}}{\sum_{k=1}^{n} |g_{i,k}|}$$

It is easy to check that the resulting matrix of the above regularization algorithm 
indeed fulfils all properties of a generator matrix. 
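Steps 2 and 3 can be sketched as follows. The sketch assumes the adjustment in Step 3 subtracts from each entry its share, proportional to its absolute value, of the row sum that remains after Step 2. The function name and the small example matrix are hypothetical.

```python
def regularize(G):
    """Turn a matrix logarithm into a generator matrix (Steps 2 and 3)."""
    result = []
    for i, row in enumerate(G):
        # Step 2: replace negative off-diagonal entries by zero
        cleaned = [g if (j == i or g >= 0.0) else 0.0 for j, g in enumerate(row)]
        # Step 3: distribute the remaining row sum in proportion to |g_ij|
        row_sum = sum(cleaned)
        abs_sum = sum(abs(g) for g in cleaned)
        if abs_sum > 0.0:
            cleaned = [g - abs(g) * row_sum / abs_sum for g in cleaned]
        result.append(cleaned)
    return result


# Hypothetical logarithm with one negative off-diagonal entry
G_raw = [[-0.10, 0.11, -0.01], [0.02, -0.05, 0.03], [0.0, 0.0, 0.0]]
G_reg = regularize(G_raw)
```

Because the correction of an entry is proportional to its own absolute value, zero entries stay zero, positive entries shrink slightly and the diagonal becomes slightly more negative, so every row sums to zero afterwards.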

In our example, we have seen that the calculation of the logarithm of the matrix 
M of Fig. 6.2 does not lead to a generator matrix. Applying Steps 2 and 3 of the
above regularization algorithm to the matrix of Fig. 6.3 leads to the generator 
matrix in Fig. 6.5 below. 

From this generator matrix, we can compute the 6-months transition matrix 
again. The result is presented in Fig. 6.6. We see that now the resulting matrix is 
indeed a transition matrix. All entries are real probabilities taking values inside the 
interval [0, 1]. Finally, we recomputed the 1-year transition matrix from the
generator matrix of Fig. 6.5 by applying the exponential function to get an impression
of how far the original data have been changed by the regularization algorithm. The
result is shown in Fig. 6.7. Comparing Figs. 6.2 and 6.7 we see that the regularization


             1         2         3         4         5         6         7         8         D

1 |  -0.093660  0.085720  0.006385  0.001426  0.000132  0.000000  0.000000  0.000000  0.000000
2 |   0.014750 -0.099110  0.075980  0.005728  0.001653  0.000312  0.000000  0.000050  0.000642
3 |   0.000680  0.034320 -0.105900  0.062890  0.006411  0.000739  0.000135  0.000095  0.000657
4 |   0.000463  0.002540  0.056010 -0.134600  0.064520  0.006779  0.001563  0.000137  0.002605
5 |   0.000059  0.000838  0.003879  0.077820 -0.196500  0.095460  0.004454  0.000295  0.013730
6 |   0.000083  0.000618  0.001888  0.004980  0.087120 -0.217800  0.074710  0.005705  0.042700
7 |   0.000000  0.000340  0.000233  0.002322  0.007297  0.131800 -0.351500  0.059400  0.150200
8 |   0.000000  0.000000  0.001690  0.000000  0.004849  0.043100  0.130500 -0.454100  0.274000
D |   0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000


Fig. 6.5 Regularization of the matrix log(M) of Fig. 6.3
             1         2         3         4         5         6         7         8         D

1 |   0.954400  0.040870  0.003823  0.000789  0.000096  0.000007  0.000001  0.000001  0.000008
2 |   0.007036  0.952100  0.036150  0.003286  0.000878  0.000176  0.000007  0.000023  0.000328
3 |   0.000387  0.016330  0.949100  0.029710  0.003462  0.000472  0.000084  0.000045  0.000365
4 |   0.000229  0.001438  0.026440  0.935900  0.029830  0.003840  0.000793  0.000078  0.001444
5 |   0.000035  0.000439  0.002336  0.035930  0.907900  0.043200  0.002750  0.000222  0.007154
6 |   0.000041  0.000308  0.000959  0.003092  0.039410  0.898900  0.032560  0.002887  0.021860
7 |   0.000002  0.000164  0.000166  0.001186  0.004503  0.057570  0.840700  0.024380  0.071340
8 |   0.000001  0.000015  0.000752  0.000118  0.002600  0.020110  0.053730  0.797700  0.125000
D |   0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  1.000000

Fig. 6.6 Six-months transition matrix computed from the generator matrix of Fig. 6.5


             1         2         3         4         5         6         7         8         D

1 |   0.911200  0.077990  0.008776  0.001743  0.000251  0.000029  0.000003  0.000002  0.000033
2 |   0.013430  0.907400  0.068850  0.007316  0.001864  0.000395  0.000027  0.000043  0.000672
3 |   0.000859  0.031110  0.902300  0.056180  0.007349  0.001145  0.000202  0.000085  0.000806
4 |   0.000454  0.003170  0.049960  0.877800  0.055250  0.008395  0.001623  0.000173  0.003170
5 |   0.000079  0.000921  0.005346  0.066460  0.827100  0.078360  0.006256  0.000573  0.014870
6 |   0.000080  0.000614  0.001965  0.007157  0.071460  0.811600  0.056910  0.005701  0.044490
7 |   0.000008  0.000319  0.000419  0.002455  0.010240  0.100800  0.709900  0.040120  0.135700
8 |   0.000003  0.000055  0.001352  0.000446  0.005476  0.037330  0.088690  0.637700  0.228900
D |   0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  1.000000


Fig. 6.7 One-year transition matrix computed from the generator matrix of Fig. 6.5


algorithm had a very mild influence on the input data only. The changes in the 
original data are well below typical statistical errors when transition matrices are 
estimated. It basically replaces the zero transition probabilities by very small 
positive probabilities and adjusts the remaining entries to make sure that all row 
entries sum to one. 

We remark that there might be situations where the regularization algorithm’s 
influence on the original data is much larger. In particular, it might change the 1-year
default probabilities, which is undesirable because they are typically tied to a master
scale that is used in many applications of a bank. Therefore, when computing a
generator matrix by a regularization algorithm, an additional requirement might be
to keep the default probabilities unchanged. This can be obtained by adding a fourth
step to the above regularization algorithm. It is a property of generator matrices that
if a generator matrix G is multiplied from the left with a diagonal matrix D that has
non-negative diagonal entries, then the
matrix product DG is still a generator matrix. Therefore, the default probabilities
can be left unchanged by finding an appropriate matrix D using some optimization 
algorithm. A good reference on transition matrices and their generators for further 
reading is Bluhm et al. (2003). 
6.3 Estimation of Transition Matrices


Having discussed the mathematical properties of transition matrices in the last 
section, the focus in this section is on the estimation of 1-year transition matrices. 
A good reference on the estimation of 1-year transition matrices is Jafry and 
Schuermann (2004). There are two simple methods of estimating a 1-year transition 
matrix, the cohort method and the duration method. The cohort method directly 
estimates a 1-year transition matrix. In practice this might lead to transition
matrices containing zero probabilities, which makes the direct calculation of a
generator matrix infeasible and the application of a regularization algorithm necessary.
To avoid the need for a regularization algorithm for calculating the generator
matrix, it is also possible to estimate the generator matrix directly. This is done by 
the duration method. 

To explain both estimation techniques, we assume that we have a data sample 
available that contains a portfolio of firms and the rating history of each firm, i.e. the 
dates where upgrades or downgrades have occurred are stored in a data base. 
An excerpt of the data sample is illustrated in Fig. 6.8. 

In the data history certain reference dates have to be defined that are used to 
define transition periods. For the estimation of a 1-year transition matrix, the length 
of these periods is equal to 1 year. For each firm in the sample and each time period,
the ratings at the period’s start and at the period’s end are observed. This
defines one empirical rating transition. We illustrate the concept with some exam- 
ples in Fig. 6.8. Firm 2 is at Y1 in rating grade 2 and at Y2 in rating grade 3. 
Therefore, this is an observation of a rating transition from grade 2 to grade 3. In the 
remaining time, Firm 2 stays in rating grade 3, i.e. from Y2 to Y3 and from Y3 to Y4 
Firm 2 contributes to observations of firms staying in rating grade 3. The treatment of 
rating transitions during the year, as in the case of Firm 5 between Y2 and Y3, depends
on the specific estimation method.

Fig. 6.8 Rating transitions in a hypothetical data sample

Both estimation techniques, the cohort method and the duration method, are the 
result of the maximum likelihood estimation principle. The transition matrix (or 
the generator matrix in case of the duration method) is estimated to maximize the 
probability associated with the empirical data sample. In the case of the cohort 
method, the transition probability from grade i to grade j is estimated as 


$$\hat{p}_{i,j} = \frac{N_{i,j}}{N_i} \tag{6.5}$$


where $N_i$ is the number of debtors in rating grade i at the beginning of each time
period and $N_{i,j}$ is the number of rating transitions from rating grade i to rating grade
j that are observed during the time period. To clarify this concept, we consider again
Fig. 6.8. In this data sample Firm 5 is downgraded from grade 2 to grade 3 between 
Y2 and Y3 and shortly after the downgrade the company is upgraded again to grade 2. 
In the estimation (6.5) the period from Y2 to Y3 for Firm 5 is counted as an 
observation of a firm that stays in rating grade 2 during this time interval, i.e. 
intermediate observations during the year are ignored by the cohort method. 
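
The counting behind (6.5) can be sketched in a few lines. This is a minimal illustration, assuming that each firm's ratings are stored as a list of grades observed at the yearly reference dates; the function name `cohort_estimate`, the 0-based grade scale and the two sample firms are illustrative assumptions and not part of the original study.

```python
def cohort_estimate(paths, n_grades):
    """Cohort estimator p_ij = N_ij / N_i from ratings at the reference dates.

    paths: one list of grades (0 .. n_grades-1) per firm, observed at
           consecutive period boundaries (hypothetical data layout).
    """
    n_ij = [[0] * n_grades for _ in range(n_grades)]
    for path in paths:
        # Only the rating at period start and period end enters the count;
        # any moves between two reference dates are ignored.
        for start, end in zip(path[:-1], path[1:]):
            n_ij[start][end] += 1
    p_hat = []
    for i in range(n_grades):
        n_i = sum(n_ij[i])  # observations of firms starting a period in grade i
        p_hat.append([n_ij[i][j] / n_i if n_i else 0.0 for j in range(n_grades)])
    return p_hat


# Two illustrative firms observed at Y1..Y4: the first moves from grade 1
# to grade 2 and stays there, the second never leaves grade 1.
paths = [[1, 2, 2, 2], [1, 1, 1, 1]]
p_hat = cohort_estimate(paths, n_grades=3)
print(p_hat[1])  # estimated transition probabilities out of grade 1
```

Grades without any observation get a row of zeros here; in practice such rows would need separate treatment.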

The duration method is different in this respect. In this estimation method all 
rating transitions are used in the estimator. The estimator for the generator matrix G 
is given by 


$$\hat{g}_{ij} = \frac{K_{ij}(T)}{\int_0^T K_i(s)\,ds} \tag{6.6}$$


where $K_{ij}(T)$ is the number of all transitions from rating grade i to rating grade j in
the data sample, T is the length of the data set’s time horizon, and $K_i(s)$ is the number
of firms in rating grade i at time s. In contrast to the cohort method, for the duration
method the splitting of the time frame into 1-year periods in Fig. 6.8 is not
necessary. One simply has to count all transitions in the data sample and approxi- 
mate the integral in (6.6) by counting all firms in each rating grade at a given time 
grid and use the result for calculating the integral. 
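
A minimal sketch of this procedure follows, assuming ratings are observed on an equidistant grid with spacing `dt` (in years) and that the integral in (6.6) is approximated by a left Riemann sum over that grid. The function names, the truncated power series used for the matrix exponential and the sample data are illustrative assumptions; the diagonal of the generator is set so that each row sums to zero, which is the usual convention for generator matrices.

```python
def duration_estimate(paths, n_grades, dt):
    """Generator estimate g_ij = K_ij / integral of K_i(s) ds for i != j."""
    k_ij = [[0] * n_grades for _ in range(n_grades)]
    exposure = [0.0] * n_grades      # approximates the integral of K_i(s) ds
    for path in paths:
        for start, end in zip(path[:-1], path[1:]):
            exposure[start] += dt    # the firm spends dt years in grade `start`
            if start != end:
                k_ij[start][end] += 1  # every observed move is counted
    g = [[0.0] * n_grades for _ in range(n_grades)]
    for i in range(n_grades):
        if exposure[i] == 0.0:
            continue                 # grade never observed: leave row at zero
        for j in range(n_grades):
            if i != j:
                g[i][j] = k_ij[i][j] / exposure[i]
        g[i][i] = -sum(g[i])         # rows of a generator sum to zero
    return g


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(g, terms=60):
    """exp(G) by a truncated power series; adequate for small generators."""
    n = len(g)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, g)]  # G^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Quarterly observations over one year (dt = 0.25). The first firm is
# downgraded and upgraded again within the year, like Firm 5 in Fig. 6.8.
paths = [[0, 1, 0, 0, 0], [0, 0, 0, 0, 0]]
g_hat = duration_estimate(paths, n_grades=2, dt=0.25)
p_one_year = mat_exp(g_hat)          # 1-year transition matrix exp(G)
```

Unlike the cohort count, the intermediate downgrade and upgrade of the first firm both enter the estimate.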

In the literature (e.g. Jafry and Schuermann, 2004), it is often considered as an 
advantage of the duration method over the cohort method that all transitions in the 
data sample, also the transitions during the year, can be used in the estimator. In a 
simple simulation study we would like to measure this advantage. We use the 
transition matrix of Fig. 6.7 as a starting point. The basic idea is to simulate rating 
paths from this matrix, estimate the 1-year transition matrix from the simulated 
rating paths, and measure the estimation error using some matrix norm. Since the 
estimation result is known (the transition matrix of Fig. 6.7) we can measure the 
estimation error exactly and can compare the accuracy of the cohort method with 
the accuracy of the duration method. Note, since the duration method estimates the 
generator matrix, we have to compute the 1-year transition matrix from the genera- 
tor matrix before computing the estimation error. 

We explain how rating paths are simulated. First, we have to define a time grid
$0 = t_0, t_1, \ldots, t_s$ where ratings should be observed. We will use a homogeneous
time grid $t_v = v \cdot \Delta t$, $v = 0, \ldots, s$, $\Delta t = t_s/s$. Since each time interval is identical, it
is sufficient to compute the transition matrix $P(\Delta t) = (p_{ij}(\Delta t))$ corresponding to the
time length $\Delta t$. To simulate a rating path, the following steps have to be carried out:


1. Definition of the initial rating k

2. Simulation of a uniformly distributed random number u on [0, 1]

3. Finding the index l with $\sum_{j=1}^{l-1} p_{kj}(\Delta t) < u \le \sum_{j=1}^{l} p_{kj}(\Delta t)$

4. Setting the rating of the next time point to k = l

5. Repeating steps 2–4 until a rating is assigned to all time points


We simulate the rating paths for a portfolio of firms and end up with a data 
sample similar to the illustration in Fig. 6.8. 
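
The simulation steps can be sketched as follows. This is an illustration only: the function name `simulate_path`, the 0-based grade scale and the three-grade matrix are hypothetical, and the comparison uses the half-open convention (cumulative sum up to l-1 <= u < cumulative sum up to l), which differs from the inequalities in step 3 only on a set of probability zero.

```python
import random
from itertools import accumulate


def simulate_path(p_dt, initial, n_steps, rng):
    """Simulate one rating path on an equidistant time grid.

    p_dt: transition matrix P(dt) as a list of rows; initial: start grade k.
    """
    path = [initial]
    k = initial                                  # step 1: initial rating
    for _ in range(n_steps):
        u = rng.random()                         # step 2: u uniform on [0, 1)
        cumulative = list(accumulate(p_dt[k]))   # partial sums of row k
        # step 3: first index whose cumulative probability exceeds u; the
        # fallback covers rows that sum to slightly less than 1 by rounding
        l = next((j for j, c in enumerate(cumulative) if u < c),
                 len(cumulative) - 1)
        k = l                                    # step 4: rating at next grid point
        path.append(k)
    return path                                  # step 5: loop covers all grid points


rng = random.Random(2024)
# Hypothetical P(dt) for a quarterly grid; the last grade is absorbing (default).
p_quarter = [[0.97, 0.02, 0.01],
             [0.03, 0.92, 0.05],
             [0.00, 0.00, 1.00]]
path = simulate_path(p_quarter, initial=0, n_steps=12, rng=rng)
```

Calling `simulate_path` once per firm with a shared `rng` yields a simulated portfolio history.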

Finally, we have to define the matrix norm that we use to measure the estimation 
error. If $P = (p_{ij})$ and $Q = (q_{ij})$ are two n × n matrices, we define the difference of these
two matrices by the matrix norm


$$\|P - Q\| = \frac{1}{n^2} \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} (p_{ij} - q_{ij})^2} \tag{6.7}$$
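
As an illustration, the distance can be computed as below, assuming (6.7) is read as the Euclidean distance between the matrix entries divided by n²; the function name and the two 2 × 2 matrices are hypothetical.

```python
import math


def matrix_distance(p, q):
    """Euclidean distance of the entries of two n x n matrices, scaled by 1/n^2."""
    n = len(p)
    total = sum((p[i][j] - q[i][j]) ** 2 for i in range(n) for j in range(n))
    return math.sqrt(total) / n ** 2


p_true = [[0.9, 0.1], [0.0, 1.0]]
p_est = [[0.8, 0.2], [0.0, 1.0]]
error = matrix_distance(p_true, p_est)   # sqrt(0.01 + 0.01) / 4
```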


To compare the two estimation methods, cohort method and duration method, 
we carry out the following steps: 


1. Definition of a portfolio, i.e. definition of the total number of firms N and the
   rating decomposition $N_i$

2. Definition of a time grid $t_v$ where ratings should be observed

3. Simulation of a rating path for each firm

4. Estimation of the 1-year transition matrix using the cohort method and estimation
   of the generator matrix using the duration method together with the calculation
   of the 1-year transition matrix from the result

5. Calculation of the estimation error for both methods using (6.7)

6. Carrying out the simulation several times and calculating average estimation
   errors


By varying the portfolio size N we can check the dependence of the estimation 
quality of transition matrices on portfolio size. Further, by refining the time grid we 
can measure the advantage of the duration method over the cohort method if there is 
any. We expect that the duration method is the more accurate the more frequently 
the firm ratings are observed. 

We have used portfolios with 1,000, 5,000, 10,000, 25,000, 50,000, and 100,000 
debtors in the first eight rating grades and no debtors in the default grade. We have 
simulated rating paths over a time interval of 3 years and we have used six different 
observation frequencies for the rating, annually, semi-annually, quarterly, monthly, 
weekly, and every 2 days. Our expectation is that the duration method will be the 
more efficient the more ratings are observed during the year. To measure the 
estimation error we have carried out 50 simulations for each combination of portfolio 
size and observation frequency, computed the estimation error in each simulation
scenario from (6.7) and averaged over the 50 scenarios. The results are reported in
Tables 6.1–6.6.


Table 6.1 Average estimation errors for annual observation frequency

#Debtors    Error cohort method    Error duration method
1,000       0.000394               0.001414
5,000       0.000165               0.001393
10,000      0.000119               0.001390
25,000      0.000076               0.001387
50,000      0.000050               0.001387
100,000     0.000037               0.001388


Table 6.2 Average estimation errors for semi-annual observation frequency

#Debtors    Error cohort method    Error duration method
1,000       0.000377               0.000784
5,000       0.000181               0.000750
10,000      0.000123               0.000735
25,000      0.000076               0.000729
50,000      0.000053               0.000734
100,000     0.000037               0.000733


Table 6.3 Average estimation errors for quarterly observation frequency

#Debtors    Error cohort method    Error duration method
1,000       0.000357               0.000484
5,000       0.000171               0.000396
10,000      0.000119               0.000386
25,000      0.000076               0.000386
50,000      0.000053               0.000375
100,000     0.000039               0.000376


Table 6.4 Average estimation errors for monthly observation frequency

#Debtors    Error cohort method    Error duration method
1,000       0.000367               0.000348
5,000       0.000166               0.000186
10,000      0.000113               0.000163
25,000      0.000075               0.000141
50,000      0.000053               0.000133
100,000     0.000037               0.000130


Table 6.5 Average estimation errors for weekly observation frequency

#Debtors    Error cohort method    Error duration method
1,000       0.000366               0.000335
5,000       0.000163               0.000149
10,000      0.000126               0.000115
25,000      0.000079               0.000076
50,000      0.000051               0.000054
100,000     0.000038               0.000045


Table 6.6 Average estimation errors for bi-daily observation frequency

#Debtors    Error cohort method    Error duration method
1,000       0.000383               0.000354
5,000       0.000176               0.000159
10,000      0.000121               0.000111
25,000      0.000072               0.000064
50,000      0.000054               0.000051
100,000     0.000037               0.000035


We see that the average estimation error of the cohort method converges to zero 
with increasing portfolio size. We also observe that this is not the case for the 
duration method unless the observation frequency is high. If ratings are observed
annually, the duration method contains a substantial estimation bias that cannot be
reduced by increasing the portfolio size. To reduce the bias in the duration method,
at least weekly observations of ratings are required, a condition hardly met in
practice. The reason for the poor performance of the duration method is that the
theory behind this estimator relies on continuous rating paths. Our simulations have
shown that violating this continuity condition introduces a bias that can
be substantial. Therefore, we recommend using the cohort method in practice
because we do not trust a method that does not converge under practically relevant
observation frequencies.

We remark that in this article we have presented the theory and the estimation of 
transition matrices assuming Markovian rating transitions and time-homogeneous 
transition probabilities. There has been research recently on relaxing one of these 
assumptions or both in modelling rating transitions. An example is Bluhm and 
Overbeck (2007). In some applications multi-year default probabilities are known 
in addition to a 1-year transition matrix. They show how removing the time- 
homogeneity assumption can lead to a satisfactory modelling of rating transitions 
in this situation. 


Chapter 7
A Multi-factor Approach for Systematic 
Default and Recovery Risk 


Daniel Rösch and Harald Scheule


7.1 Modelling Default and Recovery Risk 


Banks face the challenge of forecasting losses and loss distributions in relation to 
their credit risk exposures. Most banks choose a modular approach in line with the 
current proposals of the Basel Committee on Banking Supervision (2004), where 
selected risk parameters such as default probabilities, exposures at default and 
recoveries given default are modelled independently. However, the assumption of 
independence is questionable. Previous studies have shown that default proba- 
bilities and recovery rates given default are negatively correlated [Carey (1998), 
Hu and Perraudin (2002), Frye (2003), Altman et al. (2005), or Cantor and Varma 
(2005)]. A failure to take these dependencies into account will lead to incorrect 
forecasts of the loss distribution and the derived capital allocation. 

This paper extends a model introduced by Frye (2000). Modifications of the 
approach can be found in Pykhtin (2003) and Düllmann and Trapp (2004). Our
contribution is original with regard to the following three aspects. First, we develop 
a theoretical model for the default probabilities and recovery rates and show how 
to combine observable information with random risk factors. In comparison to the 
above mentioned models, our approach explains the default and the recovery rate 
by risk factors which can be observed at the time of the risk assessment. According 
to the current Basel proposal, banks can opt to provide their own recovery rate 
forecasts for the regulatory capital calculation. Thus, there is an immediate industry 
need for modelling. 

Second, we show a framework for estimating the joint processes of all variables
in the model. Particularly, the simultaneous model allows the measurement of the 
correlation between the defaults and recoveries given the information. In this 
model, statistical tests for the variables and correlations can easily be conducted. 
An empirical study reveals additional evidence on the correlations between risk 
drivers of default and recovery. Cantor and Varma (2005) analyze the same dataset 
and identify seniority and security as the main risk factors explaining recovery 
rates. This paper extends their approach by developing a framework for modelling 
correlations between factor-based models for default and recovery rates. 

Third, the implications of our results on economic and regulatory capital are 
shown. Note that according to the current proposals of the Basel Committee, only 
the forecast default probabilities and recovery rates but no correlation estimates, 
enter the calculation of the latter. We demonstrate the effects of spuriously neglect- 
ing correlations in practical applications. 

The rest of the paper is organized as follows. The theoretical framework is 
introduced in the second section (“Model and Estimation”) for a model using 
historic averages as forecasts and a model taking time-varying risk factors into 
account. The third section (“Data and Results”) includes an empirical analysis 
based on default and recovery rates published by Moody’s rating agency and 
macroeconomic indices from the Conference Board. Section four (“Implications 
for Economic and Regulatory Capital”) shows the implications of the different 
models on the economic capital derived from the loss distribution and the regulatory 
capital proposed by the Basel Committee. Section five (“Discussion”) concludes 
with a summary and discussion of the findings. 


7.2 Model and Estimation 


7.2.1 The Model for the Default Process 


Our basic framework follows the approach taken by Frye (2000) and Gordy (2003). 
We assume that $n_t$ firms of one risk segment are observed during the time periods
t (t = 1, ..., T). For simplicity, these firms are assumed to be homogeneous
with regard to the relevant parameters, and a latent variable describes the credit
quality of each obligor i (i = 1, ..., $n_t$):


$$S_{it} = w \cdot F_t + \sqrt{1 - w^2} \cdot U_{it} \tag{7.1}$$


(w ∈ [0,1]). $F_t \sim N(0,1)$ and $U_{it} \sim N(0,1)$ are independent systematic and idiosyncratic
standard normally distributed risk factors. The Gaussian random variable $S_{it}$
may be interpreted as the return on a firm’s assets and therefore $w^2$ is often called
“asset correlation”.

A default event occurs if the latent variable crosses a threshold c 


$$S_{it} < c \tag{7.2}$$

which happens with probability $\pi = \Phi(c)$ where $\Phi(\cdot)$ is the standard normal
cumulative distribution function. If an obligor is in default, the indicator variable $D_{it}$
equals one and zero otherwise:


$$D_{it} = \begin{cases} 1 & \text{obligor } i \text{ defaults in period } t \\ 0 & \text{else} \end{cases} \tag{7.3}$$


Conditional on the realization $f_t$ of the systematic risk factor, default events are
assumed to be independent between obligors, i.e., each firm defaults with the
conditional default probability


$$\pi(f_t) = P(D_{it} = 1 \mid F_t = f_t) = \Phi\left(\frac{c - w \cdot f_t}{\sqrt{1 - w^2}}\right) \tag{7.4}$$
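
To make the role of the systematic factor concrete, the following sketch evaluates the conditional default probability of the one-factor model for a downturn and an upturn realization of the factor. The function name is an illustrative assumption; the parameter values c = −2.0942 and w = 0.2194 are the estimates reported for the total recovery class in Table 7.5.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def conditional_pd(c, w, f):
    """Default probability given the systematic factor realization f."""
    return STD_NORMAL.cdf((c - w * f) / sqrt(1.0 - w * w))


c, w = -2.0942, 0.2194
pd_unconditional = STD_NORMAL.cdf(c)          # pi = Phi(c)
pd_downturn = conditional_pd(c, w, f=-2.0)    # adverse factor realization
pd_upturn = conditional_pd(c, w, f=2.0)       # favourable factor realization
print(round(pd_unconditional, 4), pd_downturn > pd_upturn)
```

With these values the unconditional probability is about 1.8%, which agrees with the through-the-cycle default forecast in Table 7.7.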


7.2.2 The Model for the Recovery 


In modelling the recovery rate $R_{it}$ of a defaulted obligor, we follow Schönbucher
(2003) and Düllmann and Trapp (2004) and use a logistic normal process:


$$R_{it} = \frac{\exp(Y_{it})}{1 + \exp(Y_{it})} \tag{7.5}$$


with the transformed recovery rate


$$Y_{it} = \mu + b \cdot X_t + Z_{it} \tag{7.6}$$


where $X_t \sim N(0,1)$ and $Z_{it} \sim N(0, \delta^2)$ are independent systematic and idiosyncratic
factors and μ and b are parameters. These idiosyncratic factors are independent
of the idiosyncratic factors which drive the latent default variable. Compared
to the normal distribution assumption for recovery rates in Frye (2000), the chosen
transformation has the advantage that recovery rates are bounded between 0 and
100%. Note that any other cumulative distribution function could be used. As a matter
of fact, we estimated models using a standard normal transformation and obtained
similar results.
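
The logistic transformation and its inverse can be illustrated as follows. The function names are assumptions made for this sketch; the value 0.2976 is the estimate of the mean transformed recovery rate for the senior secured class in Table 7.5.

```python
from math import exp, log


def recovery_from_transformed(y):
    """Map a transformed recovery rate y to a recovery rate in (0, 1)."""
    return 1.0 / (1.0 + exp(-y))      # equal to exp(y) / (1 + exp(y))


def transformed_from_recovery(r):
    """Inverse mapping (logit) for a recovery rate 0 < r < 1."""
    return log(r / (1.0 - r))


r_sec = recovery_from_transformed(0.2976)
print(round(r_sec, 4))
```

The result is close to the through-the-cycle recovery forecast of 0.5739 for senior secured debt in Table 7.7. Any real-valued y is mapped into the open interval (0, 1), which is the boundedness property mentioned above.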

If we observe a homogeneous segment of borrowers, the average transformed recovery
rate is given by


$$\bar{Y}_t = \frac{1}{n_t} \sum_{i=1}^{n_t} Y_{it} = \mu + b \cdot X_t + \frac{1}{n_t} \sum_{i=1}^{n_t} Z_{it} \tag{7.7}$$


with $\bar{Z}_t = \frac{1}{n_t} \sum_{i=1}^{n_t} Z_{it}$, which is normally distributed with mean zero and variance
$\delta^2 / n_t$. The variance converges to zero for large $n_t$:


$$\lim_{n_t \to \infty} \mathrm{Var}\left(\frac{1}{n_t} \sum_{i=1}^{n_t} Z_{it}\right) = 0 \tag{7.8}$$


Therefore, we approximate the average transformed recovery rate by


$$\bar{Y}_t \approx Y_t = \mu + b \cdot X_t \tag{7.9}$$


which is driven only by a systematic risk factor and normally distributed, $Y_t \sim N(\mu, b^2)$.
The link between the recovery and default process is introduced by modelling the
dependence of the two systematic risk factors. Since both $F_t$ and $X_t$ are marginally
normally distributed, we model their dependence by assuming that they have a
bivariate normal distribution with correlation parameter ρ. Alternatively, a copula
which is different from the Gaussian could have been assumed. It then follows that
the average transformed recovery rate and latent default triggering variable have
a correlation


$$\mathrm{Corr}(S_{it}, Y_t) = w \cdot \rho \tag{7.10}$$


The correlation parameter ρ equals one in the special case that a single systematic factor
drives both the default events as well as the recoveries given these events.
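
The relation Corr(S, Y) = w · ρ can be checked by a small Monte Carlo experiment. The sketch below is only an illustration: the function names, the seed and the number of draws are arbitrary assumptions, while w, ρ, μ and b are set to the estimates for the total recovery class in Table 7.5.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Sample correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate_factor_correlation(w, rho, mu, b, n_draws, rng):
    """Draw (S, Y) pairs from the two-factor model and return their correlation."""
    s_vals, y_vals = [], []
    for _ in range(n_draws):
        f = rng.gauss(0.0, 1.0)
        # X is standard normal with Corr(F, X) = rho
        x = rho * f + sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        s_vals.append(w * f + sqrt(1.0 - w * w) * u)   # latent default variable
        y_vals.append(mu + b * x)                      # transformed recovery rate
    return pearson(s_vals, y_vals)


rng = random.Random(7)
w, rho = 0.2194, 0.6539
estimate = simulate_factor_correlation(w, rho, mu=-0.3650, b=0.3462,
                                       n_draws=100_000, rng=rng)
print(round(estimate, 3), round(w * rho, 3))
```

The simulated value should lie close to w · ρ ≈ 0.143, up to sampling noise.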


7.2.3 A Multi-factor Model Extension 


So far, we presented a model for systematic risk in defaults and recoveries where 
systematic risk is driven by common factors which are not directly observable. 
These unobservable factors induce uncertainties into the forecasts of loss distri- 
butions. The higher their impact, ceteris paribus, the more skewed the resulting 
distributions are and the higher key risk measures such as the Value-at-Risk or the 
Conditional Value-at-Risk will be. Since the true parameters of the models are 
unknown, the severity of the impact must be estimated from observable data. 

As an alternative to the models above, we analyze a model, which has already 
been used in the context of default modelling. Examples are Rösch and Scheule
(2004) and Hamerle et al. (2003). These models show that part of the cyclical 
fluctuations in default rates can be attributed to observable systematic risk factors. 
Once these factors are identified and incorporated into the model, a large part of 
uncertainty from unobservable factors can be explained. These types of models 
are also exhibited in Heitfield (2005) and are related to a concept broadly known as 
a point-in-time approach because losses are forecast based on information on the 
prevailing point of the business cycle. 

In our extension, it is assumed that the default threshold for the factor model 
of the default process fluctuates over time. Alternatively, we could introduce a 
factor model with time-varying mean. This variation with time is introduced by K 
observable macroeconomic risk factors, such as GDP growth or interest rates. We
assume that these state variables are observed in prior time periods and denote them
by $z^D_{t-1} = (z^D_{t-1,1}, \ldots, z^D_{t-1,K})$. As a result, the conditional default probability for
each borrower within the risk segment is modified (compare Rösch (2003) and
Heitfield (2005) who additionally condition default probabilities on firm-specific
factors):


$$\pi^*(z^D_{t-1}, f_t) = P(D_{it} = 1 \mid z^D_{t-1}, f_t) = \Phi\left(\frac{\gamma_0 + \gamma' z^D_{t-1} - w \cdot f_t}{\sqrt{1 - w^2}}\right) \tag{7.11}$$


where $\gamma = (\gamma_1, \ldots, \gamma_K)'$ denotes a vector of exposures to the common observable
factors and $\gamma_0$ is a constant. The mean of this conditional default probability with
respect to the unobservable standard normally distributed factor $f_t$ is given by


$$\pi^*(z^D_{t-1}) = \int_{-\infty}^{\infty} \pi^*(z^D_{t-1}, f_t) \, d\Phi(f_t) = \Phi(\gamma_0 + \gamma' z^D_{t-1}) \tag{7.12}$$
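
The closed form for the mean of the conditional default probability can be verified numerically. The sketch below integrates the conditional probability against the standard normal density with the trapezoidal rule and compares the result with Φ(γ0 + γ'z). The function names, the integration grid and the lagged growth rate z = 0.02 are hypothetical; γ0, γ1 and w are the estimates for the total recovery class in Table 7.6.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pit_conditional_pd(gamma0, gamma, z, w, f):
    """Default probability given macro factors z and the factor realization f."""
    index = gamma0 + sum(g * zk for g, zk in zip(gamma, z))
    return STD_NORMAL.cdf((index - w * f) / sqrt(1.0 - w * w))


def pit_expected_pd(gamma0, gamma, z, w, lo=-8.0, hi=8.0, steps=4000):
    """Integrate the conditional probability over f with the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        f = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * pit_conditional_pd(gamma0, gamma, z, w, f) * STD_NORMAL.pdf(f)
    return total * h


gamma0, gamma, w = -1.9403, [-8.5211], 0.1473
z = [0.02]                                   # hypothetical lagged COINC growth rate
numeric = pit_expected_pd(gamma0, gamma, z, w)
closed_form = STD_NORMAL.cdf(gamma0 + gamma[0] * z[0])
print(abs(numeric - closed_form) < 1e-8)
```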


In a similar way, we assume that the mean of the log-transformed systematic
recovery rate depends on common macroeconomic factors
$z^R_{t-1} = (z^R_{t-1,1}, \ldots, z^R_{t-1,L})$. This vector may or may not contain factors which also describe
the default process:


$$Y_t = \beta_0 + \beta' z^R_{t-1} + b \cdot X_t \tag{7.13}$$


where $\beta = (\beta_1, \ldots, \beta_L)'$ denotes a vector of exposures and $\beta_0$ the constant.

If models (7.12) and (7.13) hold, i.e., defaults and recoveries are driven by observable
lagged systematic risk factors, it can be shown that their means are fluctuating
with the change of the economy. Moreover, if these models hold, then models (7.4)
and (7.9) with constant mean are misspecifications. Consequently, fitting models (7.4)
and (7.9) to observable data will have the effect that all time variation is captured
in the estimates of the exposures to the unobservable random factors $F_t$ and $X_t$. On
the other hand, attributing time variation to observable factors will lead to lower 
parameter estimates for the influences of the unobservable factors, thereby reducing 
uncertainty with regard to the forecasts of the loss distributions. We will demon- 
strate these effects on the economic and regulatory capital below. 


7.2.4 Model Estimation 


Once the models are specified, an algorithm for estimating the parameters from 
observable data is needed. Following work by Frye (2000) we choose the Maximum- 
Likelihood method. Extending these studies, we suggest an ML-procedure which 
allows the joint estimation of all coefficients, including those of models (7.11) and (7.13)
with observable factors.

Let us consider a realization $f_t$ of the unobservable random factor $F_t$. Given this
realization the default events are independent and the number of defaults
$D_t = \sum_{i=1}^{n_t} D_{it}$ is conditionally binomially distributed with probability distribution


$$P(D_t = d_t \mid f_t) = \begin{cases} \binom{n_t}{d_t} \pi(f_t)^{d_t} [1 - \pi(f_t)]^{n_t - d_t} & d_t = 0, 1, \ldots, n_t \\ 0 & \text{else} \end{cases} \tag{7.14}$$


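with $\pi(f_t)$ as in (7.4). Note that the transformed recovery rate can also be modelled
given a realization $f_t$. It holds that the random vector $(F_t, Y_t)$ is normally distributed
with

$$\begin{pmatrix} F_t \\ Y_t \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ \mu \end{pmatrix}, \begin{pmatrix} 1 & b\rho \\ b\rho & b^2 \end{pmatrix} \right)$$

From the law of conditional expectation it follows that $Y_t$ has conditional mean


$$\mu(f_t) = E(Y_t \mid f_t) = \mu + b \cdot \rho \cdot f_t \tag{7.15}$$


and conditional standard deviation


$$\sigma(f_t) = \sqrt{\mathrm{Var}(Y_t \mid f_t)} = b \cdot \sqrt{1 - \rho^2} \tag{7.16}$$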


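Hence, the joint density g(.) of $d_t$ defaults and a transformed recovery rate
$y_t$ given $f_t$ is simply the product of the density of $y_t$ and the probability of $d_t$, i.e.,


$$g(d_t, y_t \mid f_t) = \frac{1}{\sigma(f_t) \sqrt{2\pi}} \exp\left( -\frac{[y_t - \mu(f_t)]^2}{2 \sigma(f_t)^2} \right) \cdot \binom{n_t}{d_t} \pi(f_t)^{d_t} [1 - \pi(f_t)]^{n_t - d_t} \tag{7.17}$$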


Note, g(.) depends on the unknown parameters of the default and the recovery
process. Since the common factor is not observable we establish the unconditional
density


$$g(d_t, y_t) = \int_{-\infty}^{\infty} \frac{1}{\sigma(f_t) \sqrt{2\pi}} \exp\left( -\frac{[y_t - \mu(f_t)]^2}{2 \sigma(f_t)^2} \right) \cdot \binom{n_t}{d_t} \pi(f_t)^{d_t} [1 - \pi(f_t)]^{n_t - d_t} \, d\Phi(f_t) \tag{7.18}$$


Observing a time series with T periods leads to the final unconditional log-likelihood
function


$$l(\mu, b, c, w, \rho) = \sum_{t=1}^{T} \ln(g(d_t, y_t)) \tag{7.19}$$



This function is optimized with respect to the unknown parameters. In the 
appendix we demonstrate the performance of the approach by Monte-Carlo 
simulations. 
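
A minimal sketch of how the log-likelihood could be evaluated is given below. It is not the authors' implementation: the integral over the unobservable factor is approximated by the trapezoidal rule on a fixed grid, the binomial term is evaluated in log space to avoid overflow, and the function names, grid settings and the three data points are hypothetical. The parameter values are rounded estimates for the total recovery class in Table 7.5.

```python
from math import exp, lgamma, log, log1p, pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_binom_pmf(d, n, p):
    """Log of the binomial probability of d defaults among n firms."""
    p = min(max(p, 1e-300), 1.0 - 1e-16)   # guard against log(0)
    log_coeff = lgamma(n + 1) - lgamma(d + 1) - lgamma(n - d + 1)
    return log_coeff + d * log(p) + (n - d) * log1p(-p)


def period_density(d, n, y, mu, b, c, w, rho, lo=-8.0, hi=8.0, steps=2000):
    """Unconditional density of (d, y): integrate the conditional density over f."""
    sigma = b * sqrt(1.0 - rho * rho)       # conditional standard deviation of Y
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        f = lo + k * h
        pd_f = STD_NORMAL.cdf((c - w * f) / sqrt(1.0 - w * w))  # conditional PD
        mean_y = mu + b * rho * f           # conditional mean of Y
        log_dens_y = (-0.5 * ((y - mean_y) / sigma) ** 2
                      - log(sigma * sqrt(2.0 * pi)))
        weight = 0.5 if k in (0, steps) else 1.0
        total += (weight * exp(log_binom_pmf(d, n, pd_f) + log_dens_y)
                  * STD_NORMAL.pdf(f))
    return total * h


def log_likelihood(data, mu, b, c, w, rho):
    """Sum of log densities; data holds one (d_t, n_t, y_t) tuple per period."""
    return sum(log(period_density(d, n, y, mu, b, c, w, rho)) for d, n, y in data)


# Hypothetical observations: (defaults, firms, transformed recovery rate).
data = [(30, 2000, -0.30), (55, 2100, -0.75), (18, 1900, 0.05)]
ll_pos = log_likelihood(data, mu=-0.365, b=0.346, c=-2.094, w=0.219, rho=0.654)
ll_neg = log_likelihood(data, mu=-0.365, b=0.346, c=-2.094, w=0.219, rho=-0.654)
print(ll_pos > ll_neg)
```

In the hypothetical data, high default counts coincide with low recoveries, so a positive ρ fits better than a negative one. In an actual estimation, `log_likelihood` would be passed to a numerical optimizer over the parameters.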

For the second type of models which include macroeconomic risk factors, we 
replace $\pi(f_t)$ from (7.4) by $\pi^*(z^D_{t-1}, f_t)$ from (7.11) and $\mu(f_t)$ from (7.15) by
$\beta_0 + \beta' z^R_{t-1} + b \cdot \rho \cdot f_t$ and obtain the log-likelihood $l(\beta_0, \beta, b, \gamma_0, \gamma, w, \rho)$.


7.3 Data and Results 
7.3.1 The Data 


The empirical analysis is based on the global corporate issuer default rates and issue 
recovery rates (cf. Moody’s 2005). In this data set, default rates are calculated as the 
ratio of defaulted and total number of rated issuers for a given period. According to 
Moody’s (2005), a default is recorded if 


• Interest and/or principal payments are missed or delayed
• Chapter 11 or Chapter 7 bankruptcy is filed or
• A distressed exchange such as a reduction of the financial obligation occurs


Most defaults are related to publicly traded debt issues. Therefore, Moody’s 
defines a recovery rate as the ratio of the price of defaulted debt obligations 
after 30 days of the occurrence of a default event and the par value. The 
recovery rates are published for different levels of seniority such as total 
(Total), senior secured (S_Sec), senior unsecured (S_Un), senior subordinated 
(S_Sub), subordinated (Sub) and junior subordinated debt. We excluded the 
debt category junior subordinated from the analysis due to a high number of 
missing values. 

In addition, the composite indices published by The Conference Board
(http://www.tcb-indicators.org) were chosen as macroeconomic systematic risk drivers,
i.e., the


• Index of four coincident indicators (COINC) which measures the health of the
  U.S. economy. The index includes the number of employees on non-agricultural
  payrolls, personal income less transfer payments, index of industrial production
  and manufacturing as well as trade sales.
• Index of ten leading indicators (LEAD) which measures the future health of the
  U.S. economy. The index includes average weekly hours in manufacturing,
  average weekly initial claims for unemployment insurance, manufacturers’
  new orders of consumer goods and materials, vendor performance, manufacturers’
  new orders of non-defence capital goods, building permits for new
  private housing units, stock price index, money supply, interest rate spread of
  10-year treasury bonds less federal funds and consumer expectations.


The indices are recognized as indicators for the U.S. business cycle. Note that for 
the analysis, growth rates of the indices were calculated and lagged by 3 months. 

Due to a limited number of defaults in previous years, the compiled data set was 
restricted to the period 1985-2004 and split into an estimation sample (1985-2003) 
and a forecast sample (2004). Tables 7.1 and 7.2 include descriptive statistics and 
Bravais-Pearson correlations for default rates, recovery rates and time lagged 
macroeconomic indicators of the data set. Note that default rates are negatively 
correlated with the recovery rates of different seniority classes and macroeconomic 
variables. 

Figure 7.1 shows that both the default rate and the recovery rate fluctuate over time in
opposite directions. This signals that default and recovery rates show a considerable 
share of systematic risk which can be explained by time varying variables. 

Figure 7.2 contains similar graphs for the recovery rates of the different seniority 
classes. Note that the recovery rates increase with the seniority of a debt issue and 
show similar patterns over time. This indicates that they may be driven by the same 
or similar systematic risk factors. 


Table 7.1 Descriptive statistics of the variables

Variable                 Mean     Median   Max.     Min.      Std. dev.  Skew.     Kurt.
Default rate             0.0176   0.0144   0.0382   0.0052    0.0103     0.6849    2.2971
Recovery rate (Total)    0.4208   0.4300   0.6170   0.2570    0.0902     0.2883    3.0464
Recovery rate (S_Sec)    0.5794   0.5725   0.8360   0.3570    0.1379     0.2631    2.0440
Recovery rate (S_Un)     0.4481   0.4450   0.6280   0.2310    0.1158     −0.1816   2.2725
Recovery rate (S_Sub)    0.3703   0.3695   0.5190   0.2030    0.0984     −0.1868   1.7668
Recovery rate (Sub)      0.2987   0.3245   0.4620   0.1230    0.1117     −0.2227   1.7387
COINC                    0.0215   0.0245   0.0409   −0.0165   0.0160     −0.9365   3.0335
LEAD                     0.0130   0.0154   0.0336   −0.0126   0.0151     −0.4568   1.9154


Table 7.2 Bravais-Pearson correlations of variables

Variable                 Default rate   Total    S_Sec    S_Un     S_Sub    Sub      COINC    LEAD
Default rate             1.00           −0.67    −0.72    −0.72    −0.53    −0.34    −0.75    −0.47
Recovery rate (Total)                   1.00     0.78     0.68     0.72     0.29     0.32     0.54
Recovery rate (S_Sec)                            1.00     0.66     0.48     0.37     0.33     0.55
Recovery rate (S_Un)                                      1.00     0.56     0.42     0.49     0.48
Recovery rate (S_Sub)                                              1.00     0.24     0.20     0.40
Recovery rate (Sub)                                                         1.00     0.41     0.17
COINC                                                                                1.00     0.28
LEAD                                                                                          1.00

Fig. 7.1 Moody’s default rate vs. recovery rate

Fig. 7.2 Moody’s recovery rates by seniority class


Next to the business cycle and the seniority, it is plausible to presume that
recovery rates depend on the industry, the collateral type, the legal environment,
default criteria as well as the credit quality associated with an obligor. Tables 7.3
and 7.4 show the recovery rates for different industries and issuer credit ratings
(cf. Moody’s 2004, 2005). Refer to these documents for a more detailed analysis of
the properties of recovery rates.


Table 7.3 Recovery rates for selected industries (Moody’s)

Industry                         Recovery rate (1982–2003)
Utility-Gas                      0.515
Oil                              0.445
Hospitality                      0.425
Utility-Electric                 0.414
Transport-Ocean                  0.388
Media, broadcasting and cable    0.382
Transport-surface                0.366
Finance and banking              0.363
Industrial                       0.354
Retail                           0.344
Transport-Air                    0.343
Automotive                       0.334
Healthcare                       0.327
Consumer goods                   0.325
Construction                     0.319
Technology                       0.295
Real estate                      0.288
Steel                            0.274
Telecommunications               0.232
Miscellaneous                    0.395


Table 7.4 Recovery rates for selected issuer credit rating categories (Moody’s 2005)

Issuer credit rating    Recovery rate (1982–2004)
Aa                      0.954
A                       0.498
Baa                     0.433
Ba                      0.407
B                       0.384
Caa-Ca                  0.364


7.3.2 Estimation Results 


Based on the described data set, two models were estimated: 


• Model without macroeconomic risk factors [(7.4) and (7.9)]: we refer to this
  model as a through-the-cycle model because the forecast default and recovery
  rate equal the historic average from 1985 to 2003

• Model with macroeconomic risk factors [(7.11) and (7.13)]: we refer to this
  model as a point-in-time model because the forecast default and recovery rates
  fluctuate over time


Within the credit risk community, a discussion on the correct definition of
a through-the-cycle and point-in-time model exists, in which the present article
does not intend to participate. We use these expressions as stylized denominations,
being aware that other interpretations of these rating philosophies may exist (cf.
Heitfield 2005).


Due to the limitations of publicly available data, we use Moody’s global default
rates, total recoveries, and recoveries by seniority class. Table 7.5 shows the
estimation results for the through-the-cycle model (7.4) and (7.9) and Table 7.6 for
the point-in-time model (7.11) and (7.13) using the variables COINC and LEAD as
explanatory variables. In the latter model we choose both variables due to their
statistical significance.


Table 7.5 Parameter estimation results for the through-the-cycle model

Parameter   Total        S_Sec        S_Un         S_Sub        Sub
c           −2.0942***   −2.0951***   −2.0966***   −2.0942***   −2.0940***
            (0.0545)     (0.0550)     (0.0546)     (0.0544)     (0.0549)
w           0.2194***    0.2212***    0.2197***    0.2191***    0.2210***
            (0.0366)     (0.0369)     (0.0367)     (0.0366)     (0.0369)
μ           −0.3650***   0.2976**     −0.2347**    −0.5739***   −0.8679***
            (0.0794)     (0.1284)     (0.1123)     (0.0998)     (0.1235)
b           0.3462***    0.5598***    0.4898***    0.4351***    0.5384***
            (0.0562)     (0.0908)     (0.0795)     (0.0706)     (0.0873)
ρ           0.6539***    0.7049***    0.7520***    0.5081***    0.3979**
            (0.1413)     (0.1286)     (0.1091)     (0.1799)     (0.2013)

Annual default and recovery data from 1985 to 2003 is used for estimation
Standard errors are in parentheses
*** Significant at 1% level
** Significant at 5% level
* Significant at 10% level


Table 7.6 Parameter estimation results for the point-in-time model

Parameter   Total        S_Sec        S_Un         S_Sub        Sub
γ0          −1.9403***   −1.9484***   −1.9089***   −1.9232***   −1.9040***
            (0.0524)     (0.05210)    (0.0603)     (0.05660)    (0.0609)
γ1          −8.5211***   −8.1786***   −10.078***   −9.2828***   −10.134***
            (1.8571)     (1.7964)     (2.2618)     (2.0736)     (2.2884)
            COINC        COINC        COINC        COINC        COINC
w           0.1473***    0.1522***    0.1485***    0.1483***    0.1508***
            (0.0278)     (0.0286)     (0.0276)     (0.0277)     (0.0279)
β0          −0.4557***   0.1607       −0.5576***   −0.6621***   −1.1883***
            (0.0867)     (0.1382)     (0.1635)     (0.1194)     (0.1845)
β1          7.4191*      11.1867*     15.0807**    7.2136       14.9625**
            (4.1423)     (6.4208)     (6.1142)     (6.0595)     (6.8940)
            LEAD         LEAD         COINC        LEAD         COINC
b           0.3063***    0.4960***    0.4260***    0.4071***    0.4820***
            (0.0513)     (0.0838)     (0.0691)     (0.0673)     (0.0279)
ρ           0.6642***    0.7346***    0.6675***    0.4903**     0.1033
            (0.1715)     (0.1520)     (0.1481)     (0.2088)     (0.2454)

Annual default and recovery data from 1985 to 2003 is used for estimation
Standard errors are in parentheses
*** Significant at 1% level
** Significant at 5% level
* Significant at 10% level

First, consider the through-the-cycle model. Since we use the same default rates 
in each model, the estimates for the default process are similar across models, and 
consistent with the ones found in other studies (compare Gordy (2000) or Rösch
(2005)). The parameter estimates for the (transformed) recovery process reflect
estimates for the mean (transformed) recoveries and their fluctuations over time. 
Most important are the estimates for the correlation of the two processes which are 
positive and similar in size to the correlations between default rates and recovery 
rates found in previous studies. Note that this is the correlation between the 
systematic factor driving the latent default triggering variable “asset return” $S_{it}$,
and the systematic factor driving the recovery process. Therefore, higher “asset 
returns” (lower conditional default probabilities) tend to come along with higher 
recovery. A positive value of the correlation indicates negative association between 
defaults and recoveries. The default rate decreases while the recovery rate increases 
in boom years and vice versa in depression years. 

Next, consider the point-in-time model. The default and the recovery process are 
driven by one macroeconomic variable in each model. The parameters of all 
macroeconomic variables show a plausible sign. The negative sign of the COINC 
index in the default process signals that a positive change of the index comes along 
with subsequent lower number of defaults. The positive signs of the variables in the 
recovery process indicate that higher recoveries follow a positive change in the 
variable. In addition, most variables are significant at the 10% level. The only 
exception is the parameter of the macroeconomic index LEAD for the senior 
subordinated recovery rate, which indicates only a limited exposure to systematic 
risk drivers. Note that the influence of the systematic random factor is reduced in 
each process by the inclusion of the macroeconomic variable. While we do not 
mean to interpret these indices as risk drivers themselves, but rather as proxies for 
the future state of the economy, these variables are able to explain part of the 
previously unobservable systematic risk. The remaining systematic risk is reflected 
by the size of w and b and is still correlated but cannot be explained by our proxies. 

Once the point estimates for the parameters are given, we forecast separately the 
defaults and recoveries for year 2004. Table 7.7 shows that the point-in-time model 
leads to forecasts for the default and recovery rates that are closer to the realized 
values than the ones derived from the through-the-cycle model. 


Table 7.7 Forecasts and realizations for year 2004 (through-the-cycle versus point-in-time)

Parameter        Total    S_Sec    S_Un     S_Sub    Sub
Default rate
  Forecast TTC   0.0181   0.0181   0.0180   0.0181   0.0181
  Forecast PIT   0.0162   0.0162   0.0160   0.0162   0.0162
  Realization    0.0072   0.0072   0.0072   0.0072   0.0072
Recovery rate
  Forecast TTC   0.4097   0.5739   0.4416   0.3603   0.2957
  Forecast PIT   0.4381   0.6159   0.4484   0.3867   0.3014
  Realization    0.5850   0.8080   0.5010   0.4440   0.1230


7.4 Implications for Economic and Regulatory Capital


Since the main contribution of our approach lies in the joint modelling of defaults
and recoveries, we now apply the forecast default and recovery rates for the year
2004, as well as their estimated correlation, to a portfolio of 1,000 obligors. To
simplify the process, we take the senior secured class as an example and assume
a credit exposure of one monetary unit for each obligor.

Figure 7.3 and Table 7.8 compare two forecast loss distributions of the
through-the-cycle model. To demonstrate the influence of correlation between the processes
we compare the distribution which assumes independence to the distribution which
is based on the estimated correlation between the default and recovery rate
transformations of 0.7049. Economic capital or the credit portfolio risk is usually
measured by higher percentiles of the simulated loss variable such as the 95-, 99-,
99.5- or 99.9-percentile (95%-, 99%-, 99.5%- or 99.9%-Value-at-Risk). It can be
seen that these percentiles are considerably higher if correlations between default
and recovery rates are taken into account. If we take the 99.9%-Value-at-Risk as an
example, the percentile under dependence exceeds the percentile under
independence by approximately 50%. In other words, if dependencies are not taken into
account, which is a common feature in many of today’s credit risk models, the
credit portfolio risk is likely to be seriously underestimated.

Fig. 7.3 Loss distributions for the through-the-cycle model (S_Sec)


Table 7.8 Descriptive statistics of loss distributions for the through-the-cycle model

                                                                        Basel II capital
                Mean  Std. dev.   Med     95     99   99.5   99.9  Standardized  Foundation IRB  Advanced IRB
Ind. factors    7.82       5.59  6.53  18.55  27.35  31.92  39.02         80.00           74.01         70.08
Corr. factors   8.73       7.59  6.62  23.81  36.04  42.43  58.75         80.00           74.01         70.08

Portfolios contain 1,000 obligors with an exposure of one monetary unit each, 10,000 random
samples were drawn for each distribution with and without correlation between systematic factors
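
The effect of correlated systematic factors on the upper percentiles can be illustrated
with a small simulation. The following is a minimal sketch under stated assumptions, not
the authors' implementation: defaults follow a one-factor threshold model, the logit of
the yearly recovery rate is mu + b * x, and the two factors f and x have correlation rho.
The function name simulate_losses and the parameter values (default probability 1%, asset
correlation 0.1 interpreted as w squared, b = 0.5, expected recovery 50%) are illustrative
choices borrowed from the setting described for Fig. 7.5, not the estimates behind
Table 7.8.

```python
import math
import numpy as np
from statistics import NormalDist


def simulate_losses(n_obligors=1000, pd=0.01, asset_corr=0.1, mu=0.0, b=0.5,
                    rho=0.0, n_sims=10_000, seed=0):
    """Simulate portfolio losses for unit exposures (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    w = math.sqrt(asset_corr)            # assumption: asset correlation = w**2
    c = NormalDist().inv_cdf(pd)         # default threshold implied by pd
    f = rng.standard_normal(n_sims)      # systematic factor of the default process
    z = rng.standard_normal(n_sims)
    x = rho * f + math.sqrt(1.0 - rho**2) * z   # recovery factor, Corr(f, x) = rho
    # Default probability conditional on the systematic factor f
    norm_cdf = np.vectorize(NormalDist().cdf)
    cond_pd = norm_cdf((c - w * f) / math.sqrt(1.0 - w**2))
    n_defaults = rng.binomial(n_obligors, cond_pd)
    # Mean recovery of the year (large-pool assumption), logistic in mu + b * x
    recovery = 1.0 / (1.0 + np.exp(-(mu + b * x)))
    return n_defaults * (1.0 - recovery)


independent = simulate_losses(rho=0.0)
correlated = simulate_losses(rho=0.8)
for label, losses in (("independent", independent), ("correlated", correlated)):
    print(label, round(float(losses.mean()), 2),
          round(float(np.quantile(losses, 0.999)), 2))
```

Because both runs share the same seed, they differ only in how the recovery factor is
tied to the default factor, so the gap between the two 99.9% percentiles isolates the
effect of the correlation.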

Forecast default and recovery rates can be used to calculate the regulatory capital 
for the hypothetical portfolio. For corporate credit exposures, the Basel Committee 
on Banking Supervision (2004) allows banks to choose one of the following 
options: 


• Standardized approach: regulatory capital is calculated based on the corporate
  issuer credit rating and results in a regulatory capital between 1.6 and 12% of the
  credit exposure. The regulatory capital equals 8% of the credit exposure if firms
  are unrated

• Foundation Internal Ratings Based (IRB) approach: regulatory capital is
  calculated based on the forecast default probabilities and a proposed loss given default
  for senior secured claims of 45% (i.e., a recovery rate of 55%) and for
  subordinated claims of 75% (i.e., a recovery rate of 25%)

• Advanced IRB approach: regulatory capital is calculated based on the forecast
  default probabilities and forecast recovery rates


For the through-the-cycle model, the Standardized approach and the Foundation 
IRB approach result in a relatively close regulatory capital requirement (80.00 vs. 
74.01). The reason for this is that the forecast default rate (0.0181) is close to the 
historic average which was used by the Basel Committee when calibrating regu- 
latory capital to the current level of 8%. The Advanced IRB approach leads to a 
lower regulatory capital (70.08 vs. 74.01) due to a forecast recovery rate which is 
higher than the assumption in the Foundation IRB approach (57.39% vs. 55%). 
Note that Foundation IRB’s recovery rate of 55% is comparable to the average 
recovery rate of the senior secured seniority class but is proposed to be applied to 
both the senior secured (unless admitted collateral is available) as well as the senior 
unsecured claims. This could indicate an incentive for banks to favour the Founda- 
tion approach over the Advanced IRB approach especially for senior unsecured 
credit exposures. Similar conclusions can be drawn for the Foundation IRB’s 
recovery rate of 25% which will be applied for both senior subordinated as well 
as subordinated claims. 
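
The IRB figures can be approximated with the Basel II risk-weight function for corporate
exposures. The sketch below is an illustration under assumptions: it omits the maturity
adjustment (equivalent to an effective maturity of one year) and any scaling factor,
choices the chapter does not state explicitly but which appear to come close to the
tabulated values. The function name irb_capital_per_unit is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

_norm = NormalDist()


def irb_capital_per_unit(pd, lgd):
    """Basel II corporate IRB capital per unit of exposure.

    Assumption: no maturity adjustment and no scaling factor are applied.
    """
    # Supervisory asset correlation, interpolating between 0.24 and 0.12 in PD
    weight = (1.0 - exp(-50.0 * pd)) / (1.0 - exp(-50.0))
    r = 0.12 * weight + 0.24 * (1.0 - weight)
    # Default probability conditional on the 99.9% quantile of the systematic factor
    stressed_pd = _norm.cdf(
        (_norm.inv_cdf(pd) + sqrt(r) * _norm.inv_cdf(0.999)) / sqrt(1.0 - r)
    )
    # Capital covers the unexpected loss only
    return lgd * (stressed_pd - pd)


exposure = 1000.0  # 1,000 obligors with one monetary unit each
foundation_ttc = exposure * irb_capital_per_unit(pd=0.0181, lgd=0.45)
advanced_ttc = exposure * irb_capital_per_unit(pd=0.0181, lgd=1.0 - 0.5739)
foundation_pit = exposure * irb_capital_per_unit(pd=0.0162, lgd=0.45)
print(round(foundation_ttc, 2), round(advanced_ttc, 2), round(foundation_pit, 2))
```

Since capital is linear in the loss given default, the ratio of Advanced to Foundation
IRB capital equals the ratio of the forecast LGD to the prescribed 45%.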

Figure 7.4 and Table 7.9 compare the respective loss distributions with and 
without correlations using the point-in-time model. 

It can be observed that the economic capital, expressed as Value-at-Risk, is
considerably lower for the point-in-time model than for the through-the-cycle
model. The reasons are twofold. First, the inclusion of macroeconomic variables
leads to a lower forecast of the default rate (1.62%), a higher forecast of the
recovery rate (61.59%) for 2004 and therefore to lower expected losses. Second,
the exposure to unknown random systematic risk sources is reduced by the
inclusion of the observable factors. This leads to less uncertainty in the loss forecasts
and therefore to lower variability (measured, e.g., by the standard deviation) of
the forecast distribution. Moreover, the regulatory capital is the lowest for the
Advanced IRB approach which takes both the forecast default and recovery rate
into account.

Fig. 7.4 Loss distributions for the point-in-time model (S_Sec)


Table 7.9 Descriptive statistics of loss distributions for the point-in-time model

                                                                        Basel II capital
                Mean  Std. dev.   Med     95     99   99.5   99.9  Standardized  Foundation IRB  Advanced IRB
Ind. factors    6.33       3.61  5.64  13.10  18.01  20.43  25.77         80.00           71.16         60.74
Corr. factors   6.78       4.71  5.64  16.03  22.78  25.60  31.77         80.00           71.16         60.74

Portfolios contain 1,000 obligors with an exposure of one monetary unit each, 10,000 random
samples were drawn for each distribution with and without correlation between systematic factors

We also notice another important effect. The economic capital, measured by the 
higher percentiles of the credit portfolio loss, increases if the estimated correlation 
between the default and recovery rates is taken into account. This increase is not as 
dramatic as in the through-the-cycle model, although the correlation between risk 
factors of defaults and recoveries has slightly increased. The inclusion of macro- 
economic factors renders the systematic unobservable factors less important and 
diminishes the impact of correlations between both factors. To the extent that 
recoveries and defaults are not exposed at all to unobservable random factors, the 
correlations between these factors are negligible for loss distribution modelling. 
Figure 7.5 shows this effect. We assumed constant exposure of b = 0.5 to the 
recovery factor and varied the exposure to the systematic factor for the defaults 
(asset correlation) for given correlation between the systematic factors. The bench- 
mark case is a correlation of zero between the factors. Here, we notice a reduction 
of economic capital from 44 (i.e., 4.4% of total exposure) for an asset correlation of 
0.1 to 13 (1.3%) when the asset correlation is zero. In the case of a correlation
between the factors of 0.8, the Value-at-Risk is reduced from 61 (6.1%) to 13
(1.3%). Thus, the higher the correlation of the risk factors, the higher the economic
capital gains are from lowering the implied asset correlation by the explanation
with observable factors.

Fig. 7.5 Economic capital gains from decrease in implied asset correlation for correlated risk
factors; Figure shows 99.9 percentiles of loss distributions for the senior secured seniority class
depending on asset correlation and correlation of systematic risk factors. Portfolio contains 1,000
obligors each with default probability of 1%, exposure of one monetary unit, and expected
recovery of 50%


7.5 Discussion 


The empirical analysis resulted in the following insights: 


1. Default events and recovery rates are correlated. Based on an empirical data set,
we found a positive correlation among default events and a negative
correlation between default events and recovery rates.

2. The incorporation of the correlation between the default events and recovery 
rates increases the economic capital. As a result, most banks underestimate their 
economic capital when they fail to account for this correlation. 

3. Correlations between defaults decrease when systematic risk drivers, such as
macroeconomic indices, are taken into account. In addition, the impact of
correlation between defaults and recoveries decreases.

4. As a result, the uncertainty of forecast losses and the economic capital measured 
by the percentiles decreases when systematic risk drivers are taken into account. 

Most empirical studies on recovery rates (including this article) are based on
publicly available data provided by the rating agencies Moody’s or Standard and
Poor’s and naturally lead to similar results. The data sets of the rating agencies are
biased in the sense that only certain exposures are taken into account. Typically,
large U.S. corporate obligors in capital intensive industries with one or more
public debt issues and high credit quality are included. Thus, the findings cannot
automatically be transferred to other exposure classes (e.g., residential mortgage or
credit card exposures), countries, industries or products.

Moreover, the data is limited with regard to the number of exposures and periods
observed. Note that our assumption in (7.8) of a large number of firms is crucial
since it leads to the focus on the mean recovery. If idiosyncratic risk cannot be fully
diversified, the impact of systematic risk in our estimation may be overstated. Due to
the data limitations, we cannot draw any conclusions about the cross-sectional
distribution of recoveries, which is often stated to be U-shaped (see, e.g., Schuermann
2003). In this sense, our results call for more detailed analyses, particularly with
borrower-specific data which possibly includes financial ratios or other obligor
characteristics, and for an extension of our methodology to a panel of individual data.
We would therefore like to call upon the industry, i.e., companies, banks and regulators,
for feedback and a sharing of their experience.

In spite of these limitations, this paper provides a robust framework, which 
allows creditors to model default probabilities and recovery rates based on certain 
risk drivers and simultaneously estimates interdependences between defaults and 
recoveries. It can be applied to different exposure types and associated information 
levels. Contrary to competing models, the presence of market prices such as bond or 
stock prices is not required. 


Appendix: Results of Monte-Carlo Simulations 


To assess the reliability of our estimation method, a Monte-Carlo simulation
was set up which comprises four steps:


• Step 1: Specify model (7.1) and model (7.9) with a given set of population
  parameters w, c, b, μ, and ρ.

• Step 2: Draw a random time series of length T for the defaults and the recoveries
  of a portfolio with size N from the true model.

• Step 3: Estimate the model parameters given the drawn data by the
  Maximum-Likelihood method.

• Step 4: Repeat Steps 2 and 3 for several iterations.
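
The four steps can be sketched in code as follows. This is a simplified illustration, not
the estimation procedure of the chapter: instead of the full Maximum-Likelihood
estimation in Step 3, it uses large-pool moment estimates based on the probit-transformed
yearly default rates, which is a swapped-in shortcut that only works well for large
portfolios. The model form (a one-factor threshold model for defaults and a transformed
recovery of the form mu + b * x with Corr(f, x) = rho) and all function names are
assumptions made for this sketch.

```python
import numpy as np
from statistics import NormalDist

_norm = NormalDist()
norm_cdf = np.vectorize(_norm.cdf)
norm_ppf = np.vectorize(_norm.inv_cdf)


def draw_sample(c, w, mu, b, rho, n_obligors, n_years, rng):
    """Step 2: draw yearly default rates and transformed recoveries."""
    f = rng.standard_normal(n_years)
    x = rho * f + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_years)
    cond_pd = norm_cdf((c - w * f) / np.sqrt(1.0 - w**2))
    default_rate = rng.binomial(n_obligors, cond_pd) / n_obligors
    return default_rate, mu + b * x


def estimate(default_rate, y, n_obligors):
    """Step 3 (simplified): large-pool moment estimates instead of full ML."""
    # Keep the probit transform finite if a year shows no defaults at all
    d = np.clip(default_rate, 0.5 / n_obligors, 1.0 - 0.5 / n_obligors)
    a = norm_ppf(d)            # approximately (c - w * f) / sqrt(1 - w**2)
    v = a.var()                # approximately w**2 / (1 - w**2)
    c_hat = a.mean() / np.sqrt(1.0 + v)
    w_hat = np.sqrt(v / (1.0 + v))
    rho_hat = np.corrcoef(-a, y)[0, 1]
    return np.array([c_hat, w_hat, y.mean(), y.std(), rho_hat])


def monte_carlo(c, w, mu, b, rho, n_obligors=10_000, n_years=20,
                n_iter=1000, seed=0):
    """Step 4: repeat Steps 2 and 3 and summarise the estimates."""
    rng = np.random.default_rng(seed)
    estimates = np.array([
        estimate(*draw_sample(c, w, mu, b, rho, n_obligors, n_years, rng),
                 n_obligors)
        for _ in range(n_iter)
    ])
    return estimates.mean(axis=0), estimates.std(axis=0, ddof=1)


# Step 1: true parameters (c, w, mu, b, rho) for a default probability of 1%
mean_est, sd_est = monte_carlo(c=-2.3263, w=0.2, mu=0.5, b=0.5, rho=0.8)
print(np.round(mean_est, 4))
print(np.round(sd_est, 4))
```

The first printed vector holds the averages of the estimates of (c, w, mu, b, rho) across
iterations, the second their sample standard deviations.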


We used 1,000 iterations for different parameter constellations and obtained
1,000 parameter estimates which are compared to the true parameters. The portfolio
consists of 10,000 obligors. The length of the time series is set to T = 20 years.
We fix the parameters at w = 0.2, μ = 0.5, and b = 0.5 and set the correlations
between the systematic factors to 0.8, 0.1, and −0.5. In addition, we analyze three
rating grades A, B, and C where the default probabilities π and thresholds c in the
grades are:


• A: π = 0.005, i.e., c = −2.5758
• B: π = 0.01, i.e., c = −2.3263
• C: π = 0.02, i.e., c = −2.0537


Table 7.10 contains the results from the simulations. The numbers without
brackets are the averages of the parameter estimates from 1,000 simulations.
The numbers in round (.)-brackets represent the sample standard deviation of the
estimates (which serves as an approximation for the unknown standard deviation).
The numbers in square [.]-brackets give the average of the estimated standard
deviations for each estimate derived by Maximum-Likelihood theory. It can be
seen in each constellation that our ML approach for the joint estimation of the
default and recovery process works well: the averages of the estimates
are close to the originally specified parameters. Moreover, the estimated standard
deviations reflect the limited deviation for individual iterations. The small
downward bias results from the asymptotic nature of the ML estimates and should be
tolerable for practical applications.


Table 7.10 Results from Monte-Carlo simulations

Grade   ρ      c          w          μ          b          ρ
A       0.8    -2.5778    0.1909     0.4991     0.4784     0.7896
               (0.0495)   (0.0338)   (0.1112)   (0.0776)   (0.1085)
               [0.0468]   [0.0317]   [0.1070]   [0.0756]   [0.0912]
        0.1    -2.5789    0.1936     0.4970     0.4824     0.1139
               (0.0484)   (0.0336)   (0.1154)   (0.0788)   (0.2269)
               [0.0475]   [0.0322]   [0.1079]   [0.0763]   [0.2185]
        -0.5   -2.5764    0.1927     0.5048     0.4826     -0.4956
               (0.0492)   (0.0318)   (0.1116)   (0.0798)   (0.1923)
               [0.0472]   [0.0320]   [0.1078]   [0.0763]   [0.1697]
B       0.8    -2.3287    0.1927     0.4999     0.4852     0.7951
               (0.0480)   (0.0327)   (0.1104)   (0.0774)   (0.0920)
               [0.0460]   [0.0306]   [0.1084]   [0.0765]   [0.0856]
        0.1    -2.3291    0.1906     0.4927     0.4831     0.0861
               (0.0472)   (0.0306)   (0.1105)   (0.0778)   (0.2330)
               [0.0456]   [0.0305]   [0.1080]   [0.0764]   [0.2152]
        -0.5   -2.3305    0.1900     0.4988     0.4805     -0.4764
               (0.0479)   (0.0324)   (0.1115)   (0.0806)   (0.1891)
               [0.0453]   [0.0303]   [0.1074]   [0.0759]   [0.1703]
C       0.8    -2.0536    0.1935     0.4972     0.4855     0.7915
               (0.0489)   (0.0315)   (0.1104)   (0.0804)   (0.0956)
               [0.0448]   [0.0297]   [0.1080]   [0.0763]   [0.0843]
        0.1    -2.0542    0.1943     0.5030     0.4851     0.1067
               (0.0580)   (0.0382)   (0.1168)   (0.0782)   (0.2374)
               [0.0448]   [0.0298]   [0.1085]   [0.0770]   [0.2128]
        -0.5   -2.0554    0.1923     0.4998     0.4833     -0.4898
               (0.0510)   (0.0359)   (0.1085)   (0.0852)   (0.1815)
               [0.0443]   [0.0295]   [0.1076]   [0.0766]   [0.1656]


Chapter 8
Modelling Loss Given Default: A “Point in Time”-Approach


Alfred Hamerle, Michael Knapp, and Nicole Wildenauer 


8.1 Introduction 


In recent years the quantification of credit risk has become an important topic in 
research and in finance and banking. This has been accelerated by the
reorganisation of the Capital Adequacy Framework (Basel II).¹ Previously, researchers and
practitioners mainly focused on the individual creditworthiness and thus the deter- 
mination of the probability of default (PD) and default correlations. The risk 
parameter LGD (loss rate given default) received less attention. Historical averages 
of LGD are often used for practical implementation in portfolio models. This 
approach neglects the empirical observation that in times of a recession, not only 
the creditworthiness of borrowers deteriorates and probabilities of default increase, 
but LGD also increases. Similar results are confirmed in the empirical studies by 
Altman et al. (2003), Frye (2000a), and Hu and Perraudin (2002). If LGD is only 
integrated in portfolio models with its historical average, the risk tends to be 
underestimated. Hence, adequate modelling and quantification of LGD will become 
an important research area. This has also been advocated by Altman and Kishore 
(1996), Hamilton and Carty (1999), Gupton et al. (2000), Frye (2000b), and 
Schuermann (2004). 

¹ Basel Committee on Banking Supervision (2004).

The definitions of the recovery rate and the LGD have to be considered when
comparing different studies of the LGD, since different definitions also cause different
results and conclusions. Several studies distinguish between market LGD, implied
market LGD and workout LGD.² This paper uses recovery rates from Moody’s
defined as market recovery rates.

In addition to studies which focus only on data of the bond market or data of
bonds and loans,³ there are studies which focus on loans only.⁴ Loans generally
have higher recovery rates and therefore lower values of LGD than bonds.⁵ This
result relies especially on the fact that loans are more senior and in many cases also
have more collectible collateral than bonds.

Studies show different results concerning the factors potentially determining the
LGD, which are presented briefly below. The literature gives inconsistent answers to
the question of whether the borrower’s sector has an impact on LGD. Studies such as
Altman and Kishore (1996) confirm the impact of the sector. Gupton et al. (2000)
conclude that the sector does not have an influence on LGD. They trace this finding
back to the fact that their study only examines loans and not bonds.

The impact of the business cycle is confirmed by many authors, e.g. Altman et al.
(2003), Varma and Cantor (2005), Acharya et al. (2007), Grunert and Weber
(2009), and Bruche and Gonzalez-Aguado (2010). In contrast, Asarnow and
Edwards (1995) conclude that there is no cyclical variation in LGD. When comparing
these studies one has to consider that different data sources have been used, and the
latter only focused on loans.

Several studies support the influence of the borrower’s creditworthiness or the
seniority on LGD.⁶ Nearly all studies analysing LGD using empirical data calculate
the mean of the LGD per seniority, per sector, per rating class or per year.
Sometimes the means of the LGD per rating class and per seniority are calculated.
We refer to the latter as “matrix prices”, which sometimes enable a more accurate
determination of LGD than the use of simple historical averages.⁷ The authors
agree that the variance within the classes is high and there is a need for more
sophisticated models. Altman et al. (2003) suggest a first extension of the model by
using a regression model with several variables, such as the average default rate per year
or the GDP growth, to estimate the average recovery rate.

² For a definition of these values of LGD see Schuermann (2004) and Basel Committee on Banking
Supervision (2005).

³ Schuermann (2004).

⁴ Asarnow and Edwards (1995), Carty and Lieberman (1996), Carty et al. (1998).

⁵ Gupton et al. (2000).

⁶ Carty and Lieberman (1996), Carty et al. (1998), Gupton et al. (2000), Altman (2006), Roesch and
Scheule (2008).

⁷ Araten et al. (2004), Gupton et al. (2000), Schuermann (2004).

The present paper makes several contributions. A dynamic approach for LGD is
developed which allows for individual and time dependent LGDs. The model
provides “point-in-time” predictions for the next period. The unobservable part of
systematic risk is modelled by a time specific random effect which is responsible
for dependencies between the LGDs within a risk segment in a fixed time period.
Furthermore, the relationship between issuer specific rating developments and LGD
can be modelled adequately over time.

The rest of this chapter is organised as follows: Sect. 8.2 presents the statistical
modelling of the LGD. Section 8.3 describes the dataset and the model estimations.
Section 8.4 concludes and discusses possible fields for further research.


8.2 Statistical Modelling 


The dataset used in this chapter consists mainly of bond data. Recovery rates are
calculated as the market value of the bonds 1 month after default. The connection
between LGD and recovery rate can be shown as:


$$LGD_{t(i)} = 1 - R_{t(i)}.$$


Here, $LGD_{t(i)}$ and $R_{t(i)}$ denote the LGD and recovery rate of bond $i$ that defaults
in year $t$, $i = 1, \ldots, n_t$. The number of defaulted bonds in year $t$, $t = 1, \ldots, T$, is denoted
by $n_t$.

The resulting recovery rates and loss rates normally range between 0 and 1,
although there are exceptions.⁸ Firstly, the LGDs will be transformed. The
transformation used in this chapter is


$$y_{t(i)} = \log \frac{LGD_{t(i)}}{1 - LGD_{t(i)}}.$$


Written in terms of the recovery rate, the following relation is obtained:


$$y_{t(i)} = \log \frac{1 - R_{t(i)}}{R_{t(i)}} = -\log \frac{R_{t(i)}}{1 - R_{t(i)}}.$$


This logit transformation of the recovery rate is also proposed by Schönbucher
(2003) and Düllmann and Trapp (2004).⁹ The LGD can be written as:


$$LGD_{t(i)} = \frac{\exp(y_{t(i)})}{1 + \exp(y_{t(i)})}.$$


⁸ Recovery rates greater than one are unusual. In these cases the bond is traded above par after the
issuer defaults. These values are excluded from the dataset in the empirical research, see
Sect. 8.3.1.


⁹ This transformation ensures a range between 0 and 1 of the estimated and predicted LGD.
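
The transformation and its inverse are easy to check numerically. The following minimal
sketch uses hypothetical helper names (lgd_to_y, y_to_lgd) and rejects values outside the
open interval (0, 1), for which the logit is undefined.

```python
import math


def lgd_to_y(lgd):
    """Logit transform of an LGD strictly between 0 and 1."""
    if not 0.0 < lgd < 1.0:
        raise ValueError("the logit transform requires 0 < LGD < 1")
    return math.log(lgd / (1.0 - lgd))


def y_to_lgd(y):
    """Inverse transform; the result always lies between 0 and 1."""
    return math.exp(y) / (1.0 + math.exp(y))


recovery = 0.4                        # illustrative recovery rate
y = lgd_to_y(1.0 - recovery)          # LGD = 1 - R
print(y, -math.log(recovery / (1.0 - recovery)), y_to_lgd(y))
```

The first two printed values agree, which mirrors the identity between the LGD-based and
the recovery-based form of the transformation.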

Analogous to the model used in Basel II, the following approach for the
transformed values $y_{t(i)}$ is specified (Düllmann and Trapp 2004):


$$y_{t(i)} = \mu + \sigma \sqrt{\omega}\, f_t + \sigma \sqrt{1 - \omega}\, \varepsilon_{t(i)} \qquad (8.1)$$


The random variables $f_t$ and $\varepsilon_{t(i)}$ are standard normally distributed. All random
variables are assumed to be independent. The parameter $\sigma$ is non-negative and
values of $\omega$ are restricted to the interval [0, 1].

Other specifications are also discussed. Frye (2000a) suggests an approach
according to (8.1) for the recovery rate itself. Pykhtin (2003) assumes log-normally
distributed recovery rates and chooses a specification like (8.1) for $\log(R_{t(i)})$.

In the next step, model (8.1) is extended by including firm and time specific
observable risk factors. The dependence upon the observable risk factors is specified by
the following linear approach:


$$\mu_{t(i)} = \beta_0 + \beta' x_{t-1(i)} + \gamma' z_{t-1} \qquad (8.2)$$


where $i = 1, \ldots, n_t$, $t = 1, \ldots, T$. Here $x_{t-1(i)}$ characterises a vector of issuer and bond
specific factors observed in previous periods. Examples for these issuer and bond
specific variables are the issuer rating of the previous year or the seniority. By $z_{t-1}$
we denote a vector of macroeconomic variables representing potential systematic
sources of risk. The macroeconomic variables are included in the model with a time
lag. Generally it can be assumed that regression equation (8.2) holds for a
predefined risk segment, e.g. a sector.

From (8.1) and (8.2) it can be seen that the logit transformed values of LGD
are normally distributed with mean $\mu_{t(i)}$ and variance $\sigma^2$. The random time effects $f_t$
cause a correlation of the transformed values of LGD $y_{t(i)}$ of different bonds
defaulting in year $t$. This correlation shows the influence of systematic sources of
risk which are not explicitly included in the model or which affect LGD
contemporaneously. If fundamental factors have an impact on the LGD of all defaulted
bonds, at least in one sector, a correlation of LGD is obtained as a result (as long
as these systematic risk factors are not included in the model). The
factors can have different effects in different segments, e.g. different time lags or
sensitivities in different sectors. If, in contrast, the relevant systematic risk factors
are included in the vector $z_{t-1}$ and if no other risk factors influence LGD
contemporaneously, the impact of time effects should be reduced significantly.

The unknown parameters in (8.1) and (8.2) are estimated by maximum
likelihood, considering (8.1), extended by (8.2), as a panel regression model with
random effects (Baltagi 1995, Chap. 3). Note that a bond specific random effect
does not enter the model, since defaulted bonds in different periods $t$ and $s$ ($t \neq s$) are
different. Parameter estimates are obtained using PROC MIXED in SAS.¹⁰


¹⁰ Wolfinger et al. (1994).

For the covariance and correlation of the transformed values of LGD in year t,
the following relationships hold:


$$\mathrm{Cov}(y_{t(i)}, y_{t(j)}) = \omega \sigma^2,$$
$$\mathrm{Corr}(y_{t(i)}, y_{t(j)}) = \omega, \quad i \neq j.$$

8.3 Empirical Analysis 


8.3.1 The Data 


A dataset from Moody’s Default Risk Service is used for empirical analyses. It 
contains data from about 2,000 defaulted debt obligations, i.e. bonds, loans and 
preferred stock from 1983 to 2003. More than 1,700 debt obligations are from 
American companies. 

The dataset includes information about the recovery rates of defaulted bonds. 
The LGD and the transformed LGD used in this analysis can be calculated from the 
recovery rate as described in Sect. 8.2. Only the first default event of a borrower
is recorded; all default events after the first one are not considered in
this study.¹¹

About 90% of these debt obligations are bonds. To ensure a homogeneous dataset,
only bonds are used in this study. For the same reason, only data from companies in
the sector “industry”¹² are used in the final analysis. This sector contains 84% of
the bonds. In the sectors “financial service providers” and “sovereign/public utility”
there are fewer defaulted borrowers and therefore fewer defaulted bonds. After
restricting the data to American bonds in the (aggregated) sector “industry”, there
are 1,286 bonds in the dataset. Additionally, the dataset is limited to bonds with a
debt rating of “Ba3” or worse. The reason for this constraint is that the rating
categories “A3” to “Ba2” have sparse observations in several years of the period
1983-2003. In addition, several defaulted issuers have five or more bonds. Some of
these bonds have the same LGD at the time of default although they have distinct
debt ratings or distinct seniorities. Other bonds have a different LGD although they
share the same issuer and debt rating and the same seniority. These differences
cannot be explained with the data at hand. Probably they can be traced back to
issuer attributes not available in the dataset. For this reason, only issuers with four
or fewer bonds remain in the dataset.¹³ Additionally, bonds of companies with
obvious cases of fraud like Enron or Worldcom were eliminated from the dataset to
ensure a homogeneous pool.


¹¹ This constraint naturally only affects borrowers who defaulted several times. Furthermore,
observations with LGD equal to zero and negative LGD are excluded from the analysis, because
the transformed LGD $y_{t(i)}$ cannot be calculated. If the recovery rate is greater than 1, i.e. if the
market value of a bond one month after default is greater than the nominal value of the bond, the
LGD becomes negative. In the dataset this was the case in 0.5% of all observations.


¹² The (aggregated) sector “industry” contains the sectors “industrial”, “transportation” and “other
non-bank” of Moody’s sectoral classification (with 12 sectors) in Moody’s Default Risk Service
(DRS) database. For completeness, there are two other aggregated sectors: the (aggregated)
sector “financial service providers”, containing the sectors “banking”, “finance”, “insurance”,
“real estate finance”, “securities”, “structured finance” and “thrifts”, and the (aggregated)
sector “sovereign/public utility”, containing the sectors “public utility” and “sovereign”.
This aggregation was made as several sectors did not have enough observations.

Subsequently, the dataset is adjusted marginally. On the one hand, there is only 
one bond with a rating “B2” defaulting in 1996. This bond has a very small LGD 
and is removed from the dataset because it could cause a biased estimation of 
random effects. On the other hand, four bonds having a bond rating of “Ca” and “C” 
in the years 1991, 1992 and 1995 are eliminated from the dataset because they also 
have only one or two observations per year. Consequently, there are 952 bonds from 
660 issuers remaining in the dataset. 

The random effect $f_t$ and the error term $\varepsilon_{t(i)}$ are assumed to be independent, with a
standard normal distribution as described in Sect. 8.2. The transformed LGD $y_{t(i)}$ is
tested for an approximately normal distribution. As a result, a normal distribution
of the data can be assumed. This distribution can also be confirmed when the
distribution of $y_{t(i)}$ by year is tested.

In the analysis, the influence of issuer- and bond-specific variables $x_{t-1(i)}$ is
examined as mentioned in Sect. 8.2. In the analyses the following variables are
tested:


• Issuer rating: Moody’s estimated senior rating has 21 grades between “Aaa”
  (highest creditworthiness) and “C” (low creditworthiness).¹⁴ An aggregation
  of the rating categories is tested as well. A possible classification would be the
  distinction between investment grade ratings (rating “Aaa” to “Baa3”) and
  speculative grade ratings (rating “Ba1” to “C”). Besides this relatively rough
  classification the ratings are classified into the categories “Aaa” to “A3”, “Baa1”
  to “Baa3”, “Ba1” to “Ba3”, “B1” to “B3”, “Caa”,¹⁵ “Ca” and “C”. The issuer
  rating has a time lag of 1 year in the analyses.

• Debt rating: Its classification is analogous to the issuer rating and has a time lag
  of 1 year. In addition to the classifications mentioned above, the ratings are
  classified into the categories “Ba3” to “B3” and “Caa” to “C”.


¹³ In principle, only issuers with one bond could be left in the dataset if the effect of several bonds
per issuer should be eliminated. As this restriction would lead to relatively few observations, only
issuers with five or more bonds are excluded. Hence the dataset is only diminished by 4%.

¹⁴ For withdrawn ratings, Moody’s uses a class “WR”. Because of the lagged consideration of
rating there are no bonds in the dataset with rating “WR” one year before default.

¹⁵ Moody’s used to name this rating class “Caa” until 1997. Since 1998, this class has been
separated into the three rating classes “Caa1”, “Caa2” and “Caa3”. To use the data after 1998, the
latter three ratings have been aggregated in one rating class which is named “Caa” in the following.
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e Difference between issuer and debt rating: the fact that the issuer rating is one, 
two, three or more than three notches better than the debt rating is tested on its 
impact on the transformed LGD. Additionally, the impact of the fact that the 
issuer rating is better or worse than the debt rating is tested. The rating classifi- 
cation of an issuer and a bond can differ if the bond finances a certain project 
which has a different risk and solvency appraisal compared to the issuer. 

e Seniority: Starting with Moody’s classification, the classes “senior secured”, 
“senior unsecured”, “senior subordinated”, “subordinated” and “junior subordi- 
nated” are distinguished.'® To distinct these seniority classes from the relative 
seniority, they are sometimes referred to as absolute seniority. 

• Relative seniority: Following Gupton and Stein (2005), the relative importance 
of the seniority is examined. This variable can be best explained by an 
example: If issuer 1 has two bonds (one with seniority “subordinated” and the 
other “junior subordinated”) and issuer 2 has three bonds (one with seniority 
“senior secured”, another with “senior subordinated” and the third with seniority 
“subordinated”), then the “subordinated” bond from issuer 1 is going to be 
served first and possibly has a lower LGD than the “subordinated” bond from 
issuer 2, which is served after the two other bonds from issuer 2. 

• Additional backing by a third party: If the bond is secured additionally by a third 
party in addition to the protection by the issuer of the bond, then this information 
is also used in the analyses. 

• Maturity (in years): The maturity of the bond is calculated as the difference between 
the maturity date and the default date. It indicates the remaining time to maturity 
had the bond not defaulted. 

• Volume of defaulted bond (in million dollars): The number of outstanding 
defaulted bonds times the nominal of this defaulted bond denotes the volume 
of the defaulted bond. It quantifies the influence of the volume of one defaulted 
bond, not the influence of the volume of defaulted bonds in the market altogether. 
Certain companies such as insurance companies are not allowed to hold defaulted 
bonds. On the other hand, there are speculative investors who are interested in 
buying defaulted bonds. The higher the volume of the defaulted bond, the higher the 
supply of defaulted bonds on the market. Therefore it can be more difficult for 
the defaulted issuers to find enough buyers or to claim high prices for the 
defaulted bond. 

• Issuer domicile: The country of the issuer is implicitly considered by the 
restriction to American data. This restriction can be important because different 
countries may be in different stages of the economic cycle in the same year. 
If the data is not limited to a certain country, the macroeconomic condition of all 
countries included in the dataset should be considered. Additionally, different 
legal insolvency procedures exist in different countries, so that a country’s legal 
procedure can influence the level of recovery rates and LGD. 
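
The bond-level variables described above (relative seniority, remaining maturity and volume of the defaulted bond) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the numeric seniority codes, the 365.25-day year and the sample bonds are assumptions made for illustration, and the ranking rule is one reading of the two-issuer example given for relative seniority.

```python
from datetime import date

# Absolute seniority classes in the order listed in the text.
# The numeric codes are an assumption used only for sorting.
SENIORITY_ORDER = {
    "senior secured": 1,
    "senior unsecured": 2,
    "senior subordinated": 3,
    "subordinated": 4,
    "junior subordinated": 5,
}

def relative_seniority(bonds_by_issuer):
    """Rank each bond within its issuer: 1 means served first."""
    ranks = {}
    for issuer, bonds in bonds_by_issuer.items():
        # Distinct absolute classes of this issuer, most senior first.
        classes = sorted({s for _, s in bonds}, key=SENIORITY_ORDER.get)
        for bond_id, seniority in bonds:
            ranks[bond_id] = classes.index(seniority) + 1
    return ranks

def remaining_maturity_years(maturity_date, default_date):
    """Maturity date minus default date, in years (365.25-day year assumed)."""
    return (maturity_date - default_date).days / 365.25

def defaulted_volume_musd(n_outstanding, nominal_usd):
    """Number of outstanding bonds times nominal, in million dollars."""
    return n_outstanding * nominal_usd / 1e6

# The two issuers from the relative seniority example (bond ids are hypothetical).
bonds = {
    "issuer 1": [("1a", "subordinated"), ("1b", "junior subordinated")],
    "issuer 2": [("2a", "senior secured"), ("2b", "senior subordinated"),
                 ("2c", "subordinated")],
}
ranks = relative_seniority(bonds)
maturity = remaining_maturity_years(date(2005, 1, 1), date(2003, 1, 1))
volume = defaulted_volume_musd(200_000, 1_000)
print(ranks, round(maturity, 2), volume)
```

In this sketch the “subordinated” bond of issuer 1 receives rank 1, while the “subordinated” bond of issuer 2 receives rank 3, which mirrors the example in the text.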


¹⁶ For a consideration of the hierarchy of seniority classes see Schuermann (2004, p. 10). 

Fig. 8.1 Average LGD per year for bonds in the (aggregated) sector “industry” 


Fig. 8.1 depicts the average (realised) LGD per year for bonds in the (aggregated) 
sector “industry” in the period 1983–2003. 

As can be seen from Fig. 8.1, LGD is clearly subject to cyclical variability. 
This is why the cyclical variations of LGD are explained with the help of 
macroeconomic variables in the vector $z_{t-1}$. For this purpose, a database with more than 60 
potential macroeconomic variables is established. It contains interest rates, labour 
market data, business indicators like gross domestic product, consumer price index 
or consumer sentiment index, inflation data, stock indices, the Leading Index etc.¹⁷ 
In addition, the average default rate per year of the bond market is taken into 
account. All variables are included contemporaneously and with a time lag of at least 1 
year. The consideration of these variables should enable a “point-in-time” model. 
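
The role of the time lag can be made concrete with a small sketch. The helper name and the default rates below are hypothetical values chosen for illustration, not figures from the dataset; the point is only that a bond defaulting in year t is matched with the macroeconomic value observed in t-1, which is already known when the prediction is made.

```python
def lag_by_one_year(series):
    """Map each year t to the value observed in t-1 (years without a predecessor are dropped)."""
    return {year: series[year - 1] for year in series if year - 1 in series}

# Hypothetical average default rates of the bond market, in percent.
default_rate = {2000: 2.4, 2001: 3.8, 2002: 3.0, 2003: 1.8}
lagged = lag_by_one_year(default_rate)

# Macroeconomic covariate for bonds defaulting in 2003:
# only information available at the end of 2002 is used.
z_2003 = {"default_rate_lag1": lagged[2003]}
print(lagged, z_2003)
```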


8.3.2 Results 


Two different model specifications for the (aggregated) sector “industry” are 
examined.¹⁸ In contrast to model (8.1), another (but equivalent) parameterisation 
is used. The models can be estimated directly with the procedure MIXED in 
the statistical program SAS. In the next step, the parameter estimates for $\sigma$ and $\omega$ 
can be determined from the estimates for $b_1$ and $b_2$. Table 8.1 summarises the 
results. 


¹⁷ A list of potential macroeconomic factors can be found in the appendix. 

¹⁸ Additionally, models for all sectors are estimated containing dummy variables for the different 
sectors in addition to the variables mentioned below. The use of a single sector leads to more 
homogeneous data. 




Table 8.1 Parameter estimates and p-values (in parentheses) for models I and II (only bonds of 
the (aggregated) sector “industry”) 

                                                      Model I               Model II
AIC                                                   3,224.3               3,222.8
$b_2^2$ (variance of the error term)                  1.7336 (<0.0001)      1.7327 (<0.0001)
$b_1^2$ (variance of the random time effect)          0.3421 (0.0052)       0.2859 (0.0064)
Constant                                              -0.3868 (0.1146)      -0.8697 (0.0164)
Debt rating “Ba3” to “B3” (t-1)                       -0.1938 (0.0463)      -0.1783 (0.0672)
Seniority “senior unsecured”                          0.6194 (0.0004)       0.6064 (0.0005)
Seniority “senior subordinated”                       0.7061 (0.0002)       0.6909 (0.0002)
Seniority “subordinated” and “junior subordinated”    1.0487 (<0.0001)      1.0443 (<0.0001)
Relative seniority “2” and “3”                        0.5041 (0.0001)       0.5084 (<0.0001)
Additional backing by a third party                   -0.2717 (0.0325)      -0.2697 (0.0338)
Bond maturity (in years)                              0.03407 (0.0020)      0.03546 (0.0013)
Volume of defaulted bonds (in million dollars)        0.001118 (0.0001)     0.001087 (0.0002)
Average default rate (in percent) (t-1)               –                     0.2186 (0.0358)


Model I: $y_{it} = \beta_0 + \beta' x_{it-1} + b_1 f_t + b_2 e_{it}$. 
Model II: $y_{it} = \beta_0 + \beta' x_{it-1} + \gamma' z_{t-1} + b_1 f_t + b_2 e_{it}$. 
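
To illustrate how the Model II estimates combine into a prediction on the transformed scale, the following minimal sketch evaluates the fixed part of the model for one bond. The coefficients are copied from Table 8.1; the dictionary keys, the function name and the example bond are hypothetical, and setting the random effects to a mean of zero is an assumption of this sketch. The result is the transformed LGD, not the LGD itself.

```python
# Model II point estimates copied from Table 8.1 (key names are hypothetical).
MODEL_II = {
    "constant": -0.8697,
    "debt_rating_Ba3_to_B3": -0.1783,
    "senior_unsecured": 0.6064,
    "senior_subordinated": 0.6909,
    "subordinated_or_junior": 1.0443,
    "relative_seniority_2_or_3": 0.5084,
    "third_party_backing": -0.2697,
    "maturity_years": 0.03546,
    "volume_musd": 0.001087,
    "default_rate_lag1_pct": 0.2186,
}

def predict_transformed_lgd(features, coef=MODEL_II):
    """Fixed part of the model; random effects are set to zero (assumption)."""
    y = coef["constant"]
    for name, value in features.items():
        y += coef[name] * value
    return y

# Hypothetical bond: senior unsecured, debt rating in "Ba3" to "B3" one year
# before default, ranked first within the issuer, no third-party backing,
# 5 years to maturity, volume of 100 million dollars, lagged default rate 3%.
bond = {
    "debt_rating_Ba3_to_B3": 1,
    "senior_unsecured": 1,
    "maturity_years": 5.0,
    "volume_musd": 100.0,
    "default_rate_lag1_pct": 3.0,
}
y_base = predict_transformed_lgd(bond)
# Same bond under a higher lagged default rate of 5%.
y_stress = predict_transformed_lgd({**bond, "default_rate_lag1_pct": 5.0})
print(round(y_base, 4), round(y_stress, 4))
```

Because the transformation of LGD is strictly monotonic, the higher value under the 5% default rate corresponds to a higher predicted LGD.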


The results of models I and II can be interpreted as follows¹⁹: If a bond is rated 
“Ba3”, “B1”, “B2” or “B3” 1 year before default, it has a significantly smaller LGD 
than a bond with rating “Caa”, “Ca” or “C”. In addition to the debt rating, the 
seniority also affects LGD. Bonds with seniority “senior unsecured” as well as 
bonds with seniority “senior subordinated”, “subordinated” or “junior subordinated” have 
a significantly higher LGD than “senior secured” bonds. When the seniority classes 
are compared, it can be stated that “senior unsecured” bonds have a smaller LGD 
than “senior subordinated” bonds. Bonds with seniority “subordinated” or “junior 
subordinated” have the highest LGD. Holders of better secured bonds can draw on 
more valuable collateral than holders of lower ranked bonds, which results in lower 
losses. Generally, this result supports the results published by Moody’s.²⁰ 

However, not only the (absolute) seniority, but also the relative seniority affects 
LGD. If a bond is ranked second or third in terms of collateralisation, the LGD of 
this bond is significantly higher than the LGD of a bond secured at first rank. If the 
company is liquidated, the latter are served before the bonds 
ranking second or third and therefore have to bear fewer losses. 

Regarding the relationship between absolute and relative seniority and LGD, it 
must be recognised that besides the creditworthiness of the bond, the seniority also 
plays a role for the determination of LGD. The fact that in addition to the absolute 
seniority, relative seniority also influences LGD is an interesting result. This 
relationship is also detected in the models of Gupton and Stein (2002, 2005). 


¹⁹ In general, all interpretations according to the quoted model refer to the transformed LGD $y_{it}$. 
As $y_{it}$ is the result of a strictly monotonic transformation of LGD all interpretations hold as well 
for LGD. 

²⁰ Hamilton and Carty (1999). 

If in addition to the collateralisation by the direct issuer, the bond is protected by 
a third party, these bonds have a significantly lower LGD than bonds without this 
additional backing. These additional providers of collateral could fill in for the 
defaulted company if the latter does not have a substantial value. Therefore, it can 
reduce the loss of these bond creditors. 

Another impact on LGD is given by the maturity of the bond. A longer maturity 
leads to higher LGDs. This result can possibly be explained by the fact that future 
payments are uncertain. The recovery rate and LGD are calculated from the market 
price 1 month after default. If maturity is longer, more of the cash flows fall due 
further in the future, and these are generally more uncertain. This is reflected in 
lower recovery rates and higher LGDs. Gupton and Stein (2005) reject the influence 
of maturity on LGD in their recent paper. In their opinion the maturity does not play a role for 
defaulted bonds. Only the risk horizon matters, which is 1 year in their analysis. 
However, Gupton and Stein (2005) neglect the uncertainty of future cash flows. 

Additionally, the volume of the defaulted bonds influences LGD as a factor of 
the supply side. As mentioned above, a higher volume of defaulted bonds leads to a 
higher supply and to lower prices for these bonds, i.e. to lower recovery rates and 
higher LGDs.²¹ 

The incorporation of macroeconomic factors in model II tries to explain the 
cyclical variations of LGD. These factors can be interpreted as follows: The 
average default rate of the bond market (in percent) with a time lag of 1 year is 
taken into account in the model as a possible proxy for the cyclical influence. An 
increasing lagged average default rate leads to significantly higher LGDs. This 
result is supported by Altman et al. (2003) who detected a positive relationship 
between the default rate and the (average) LGD as well. 

The cyclical variation in LGD (see Fig. 8.1) can be explained by the fact that 
more borrowers and therefore more bonds are defaulting during a recession. More 
companies and more collateral have to be liquidated, which leads on the one hand to a 
greater supply of collateral and therefore to lower collateral prices. On the other hand, 
the demand for this collateral declines because the non-defaulted 
companies are not able to invest the same amount of money during a recession as 
during an expansion. Macroeconomic variables like the lagged default rate try to 
explain these cyclical variations. 

Apart from the models described above, several other models were tested. 
A potential variable is the difference between issuer and debt rating in the year 
before default.²² If the issuer rating is better than the debt rating, the LGD of this 
bond is expected to be smaller than the LGD of bonds with an issuer rating equal to 
or worse than the debt rating. Because issuers with an issuer rating better than the 
debt rating have a higher creditworthiness, we can expect that 
there is an additional protection of the bond by the issuer. However, this variable 
did not influence LGD significantly. 


²¹ Altman et al. (2003) also detected a relationship between the average LGD per year and the 
volume of defaulted bonds. 

²² For example the issuer rating could be “Aaa” and the debt rating “A”. 


In addition, the interactions between absolute and relative seniority were tested. 
As they are only partially significant they are not included in the model. The 
interactions between issuer rating and absolute and relative seniority were included 
as well but do not show a significant influence on LGD. 

Additionally, a finer sectoral classification is tested to distinguish the impact of 
several sectors. This finer classification does not have sufficient observations for all 
sectors so a model with this fine classification cannot be estimated. 

Moreover, other macroeconomic factors are integrated in the model. They 
comprise the GDP (gross domestic product) growth and the “index of leading 
indicators” which are included in the models contemporaneously and with a time lag 
of 1 or 2 years. Furthermore, several macroeconomic variables such as the 
unemployment rate, the consumer sentiment index, the yield of the consumer sentiment 
index and different interest rates are tested with several lags. The average LGD per 
year is included with a time lag of 1 year in the model. These variables do not affect 
LGD significantly when the default rate 1 year before default is also included in the 
models. Altman et al. (2001, 2003) obtain similar results. They conclude that 
fundamental macroeconomic variables do not have a significant influence on the 
average LGD in a multivariate context if the model contains the default rate. 

The variance of the error term $b_2^2$ is 1.9266 if a model without explanatory 
variables is used. Only the constant term reflecting the average level of the 
transformed value of LGD is taken into account in this model. In models I and II 
the variance of the error term declines slightly to 1.7336 and 1.7327, 
respectively. This can be attributed to the improved estimation of LGD including 
issuer specific and macroeconomic variables and thus to a decreasing prediction 
risk. 

In model II, the variance of the random time effect $b_1^2$ decreases compared to 
model I because appropriate macroeconomic factors have been integrated. 
This result indicates that the integration of the default rate leads to a decrease in the 
variance of the random effect. 

Taking (8.1) into account, the variance of the transformed LGD $\sigma^2$ and the 
correlation $\omega$ for two different borrowers in the same year are examined. A standard 
deviation $\sigma$ of 1.4883 for a model without explanatory variables, 1.4407 for model I 
and 1.4208 for model II is obtained.²³ The correlation $\omega$ between the predicted 
LGDs for next year is 16.48% in model I. It declines to 14.16% in model II because 
of the effect of systematic economic risk factors.²⁴ 
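
The reported values can be reproduced from Table 8.1 with a few lines of Python. The sketch assumes the relations $\sigma^2 = b_1^2 + b_2^2$ and $\omega = b_1^2/\sigma^2$, with $b_1^2$ the variance of the random time effect and $b_2^2$ the variance of the error term; the function name is hypothetical.

```python
import math

def sigma_and_omega(var_time_effect, var_error):
    """Return (sigma, omega) given b_1^2 and b_2^2 (assumed relations, see lead-in)."""
    sigma2 = var_time_effect + var_error
    return math.sqrt(sigma2), var_time_effect / sigma2

# Variance estimates from Table 8.1.
sigma_1, omega_1 = sigma_and_omega(0.3421, 1.7336)  # model I
sigma_2, omega_2 = sigma_and_omega(0.2859, 1.7327)  # model II

print(round(sigma_1, 4), round(100 * omega_1, 2))
print(round(sigma_2, 4), round(100 * omega_2, 2))
```

Under these assumptions the sketch returns the standard deviations and correlations quoted in the text for models I and II.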

Finally, it should be mentioned that the variance estimates $\sigma^2$ for models I and II 
are still high. This result indicates that there may be further important issuer specific 
variables which explain the variation of LGD. Examples are balance sheet variables 
not available in Moody’s dataset. 


²³ $\sigma^2 = b_1^2 + b_2^2$. 

²⁴ $\omega = b_1^2 / \sigma^2$. 


8.4 Conclusions 


In most empirical analyses concerning LGD, the distribution of LGD is implicitly 
assumed to be constant and LGD is generally estimated using historical averages. 
Under this assumption, the individual values of LGD of issuers within a certain time 
period as well as the values of LGD over time deviate only randomly from a certain 
mean. Such an assumption seems to be unrealistic given the fact that in times of a 
recession, not only does the creditworthiness of the borrowers decline and PDs rise, 
but LGD is also systematically higher. 

In this chapter a dynamic approach which generalises other approaches is 
presented. LGD is modelled depending on issuer and bond specific as well as 
macroeconomic factors. As the variables are lagged, the LGDs for the next year can 
be predicted on the basis of values that are known at the time the prediction is made. 

Reduced uncertainty in the prediction of LGD is important for the determination 
of LGD, not only for Basel II but also for internal risk management using credit 
portfolio models. At a given state of the economy, more precise predictions about 
the economic capital can be made than using historical averages. Furthermore, in a 
credit portfolio model, the prediction uncertainty can be taken into account in the 
simulation of the predicted loss distribution, e.g. resulting from the estimation of 
the parameters $\beta$ and $\gamma$. 

In a next step, further bond specific performance figures that could not be 
reproduced in the dataset at hand will be analysed. This could lead to a further 
reduction of prediction uncertainty, which is relatively high in comparison to PD 
predictions. If banks have a database which is large enough to estimate individual 
LGDs, the model presented in this chapter can be used. Although there may be other 
factors influencing LGD in a bank, e.g. type of collateral (financial collaterals, real 
estate etc.), the LGD can be estimated individually using an econometric approach. 
The “point-in-time” predictions of LGD can also be used to predict downturn LGDs 
demanded by Basel II using downturn states of the macroeconomic variables. At 
present there are relatively few studies for the determination of recovery rates and 
LGD on the basis of individual data. Moreover, the availability of data is restricted. 
Therefore, further research is necessary in this area. 


Appendix: Macroeconomic Variables 


Interest Rate Fed Funds, monthly 
Interest Rate Treasuries, constant maturity 6 months, nominal, monthly 
Interest Rate Treasuries, constant maturity 1 year, nominal, monthly 
Interest Rate Treasuries, constant maturity 5 years, nominal, monthly 
Interest Rate Treasuries, constant maturity 7 years, nominal, monthly 
Interest Rate Treasuries, constant maturity 10 years, nominal, monthly 
Interest Rate Conventional mortgages, fixed rate, monthly 
Commercial bank interest rates, 48-month new car, quarterly 
Commercial bank interest rates, 24 months personal, quarterly 
Commercial bank interest rates, all credit card accounts, quarterly 
Commercial bank interest rates, credit card accounts, assessed interest 
Interest Rate, new car loans at auto finance companies, monthly 
Interest Rate, bank prime loan, monthly 

Civilian Labour Force Level 
Employment Level 
Unemployment Level 
Unemployment rate 
Initial Claims for Unemployment Insurance 
Challenger Report, Announced Layoffs 
Mass Layoffs 

Manufacturing Data: 
Shipments Total Manufacturing 
New Orders Total Manufacturing 
Unfilled Orders Total Manufacturing 
Inventory Total Manufacturing 
Inventory to shipments Total Manufacturing 
Capacity Utilization total 

Business Bankruptcy Filings 
Non-business Bankruptcy Filings 
Total Bankruptcy Filings 

Dow Jones Industrial Index 
S&P 500 
NASDAQ 100 

Price Indices: 
GDP Implicit Price Deflator (2000 = 100) 
Consumer Price Index, All Urban Consumers; U.S. city average, all items 
Producer Price Index; U.S. city average, Finished Goods 

Gross Domestic Product 
Gross Private Domestic Investment 
Percent Change From Preceding Period in Real Gross Domestic Product 
Public Debt 
Tax Revenues 
University of Michigan Consumer Sentiment Index 
PMI (Purchasing Managers’ Index, Institute for Supply Management) 
Retail Sales total (excl. Food Services) 
Revised Estimated Monthly Sales of Merchant Wholesalers 
Business Cycle Indicator: Index of Leading Indicators (The Conference Board) 
Average crude oil import costs (US$/barrel) 
Average default rate of issuers at the bond market 


Chapter 9 
Estimating Loss Given Default: Experience 
from Banking Practice 


Christian Peter 


9.1 Introduction 


Modern credit risk measurement and management systems depend to a great extent 
on three key risk parameters: probability of default (PD), exposure at default 
(EAD), and loss given default (LGD). PD describes the probability that the lending 
institution will face the default of some obligor or transaction. EAD gives an 
estimate of the exposure outstanding at the time of the default, also indicating the 
maximum loss on the respective credit products.¹ Finally, LGD measures the credit 
loss a bank is likely to incur due to an obligor default. 

In its advanced internal rating based approach (IRBA), the New Basel Accord 
(Basel II) underpins the importance of these key parameters by allowing financial 
institutions to apply their own estimates for PD, EAD, and LGD in the computation 
of regulatory capital. Since the risk weight of a credit facility is linear in LGD, the 
bank’s ability to appropriately estimate LGDs for its portfolios will directly affect 
the amount of regulatory capital required under Basel II. 

LGD numbers may, however, not only play a significant role in internal credit 
risk management and future regulatory reporting, but may also be used in 
accounting. For example, a bank may want to apply modified LGDs in its fair value as well 
as impairment computations required for IAS/IFRS.² Despite all these fields of 
application, LGD estimation has gained relatively little attention in the literature.³ 


¹ This article will use the expressions credit product and (credit) facility interchangeably as generic 
terms for all credit risk bearing instruments of a bank. 

² International Accounting Standard/International Financial Reporting Standard. 

³ See for example Altman et al. (2005) or the articles available at http://www.defaultrisk.com 

This article approaches LGD estimation from a perspective gained in banking 
practice, intending to address not only the estimation problem itself but also to 
touch on some aspects of the development process as well as the later application of 
these numbers. In doing so, the article concentrates on practical aspects of 
the topic rather than on statistical details. The article is organized as follows: The first 
section discusses the requirements arising from different domains of application for 
LGD estimates. Economic loss and LGD are introduced next. The following section 
presents a short survey of different approaches for LGD estimation. A model for 
workout LGD as well as the design of an LGD model for performing and defaulted 
exposures is discussed in the next three sections. Finally, the article closes with 
some concluding remarks. 


9.2 LGD Estimates in Risk Management 


A bank may apply LGD estimates for different domains of application, which often 
impose different requirements on the definition of the performance number and its 
estimation procedures. Regulatory requirements as defined in BCBS (2004) are 
surveyed in Sect. 9.2.1. Afterwards, Sect. 9.2.2 outlines further requirements which 
may be raised from a risk management and accounting perspective. 


9.2.1 Basel II Requirements on LGD Estimates: A Short Survey 


BCBS (2004) defines several requirements on LGD estimates eligible for determin- 
ing regulatory capital. The following provides a short survey: 


• Scope. The foundation IRB approach requires own LGD estimates for 
retail exposures only (§ 331). The advanced IRB approach also allows banks 
to use their own estimates for corporate, sovereign, and bank exposures (§§ 297 
and 298).⁵ 

• Default definition (§§ 452–457). The reference definition of default given in 
BCBS (2004) provides the basis for LGD estimation. When using internal or 
external loss data inconsistent with this definition, appropriate adjustments have 
to be made. 


⁴ See BCBS (2004) for the full text as well as additional rules not mentioned here (for example, 
concerning documentation, stress tests, overrides, etc.). The reader should also take the respective 
regulations of national supervisors into account. 

⁵ For purchased receivables, see §§ 364 and 367. 

• Loss definition (§ 460). LGD is based on economic loss; see Sect. 9.3 for details. 

• LGD estimates (§§ 468–471). “A bank must estimate an LGD for each facility 
that aims to reflect economic downturn conditions where necessary to capture 
the relevant risks” (downturn LGD). The “long-time, default-weighted average 
of loss rate given default calculated based on the average economic loss of all 
observed defaults [...] for that type of facility” provides a lower limit for LGD 
estimates. If existent, cyclical variation has to be taken into account. Any 
significant dependence “between the risk of the borrower and the collateral 
or its provider” as well as the effect of currency mismatches must be 
considered in a conservative manner. “LGD estimates must be grounded in historical 
recoveries and, where applicable, must not solely be based on the collateral’s 
estimated market value”. An institution must fulfil certain requirements on its 
collateral management processes for all collateral that is recognized in the 
bank’s LGD estimates. 

  For defaulted exposures, banks have to determine a best estimate LGD, which is 
based “[...] on the current economic circumstances and facility status”, as well as 
a conservative estimate reflecting “[...] the possibility that the bank would have to 
recognize additional, unexpected losses during the recovery period”. 


• Data requirements (§§ 472–473). The data basis should ideally cover at least one 
economic cycle, but must be no shorter than 7 years for sovereign, bank, and 
corporate exposures or 5 years for retail exposures, respectively. 

• Assessing the effect of guarantees and credit derivatives (§§ 480–489). Banks 
are allowed to reflect the effect of guarantees through adjustment of either PD or 
LGD estimates. The respective adjustment criteria must be clearly specified, 
plausible, and appropriate. The bank must adopt the chosen technique in a 
consistent way (both over time and across different types of guarantees). 
Furthermore, it must assign a rating to each guarantor, fulfilling all minimum 
requirements defined for borrower ratings. Except for certain types of obligors, 
guarantors, and instruments, the adjustment of PD or LGD is restricted in a way 
such that the risk weight of the guaranteed exposure must not be lower than the 
risk weight of a comparable direct exposure to the guarantor (no recognition of 
double-default effects). There are no restrictions on eligible guarantors. 
Guarantees must fulfil certain standards (for example, evidenced in writing, 
non-cancellable on the part of the guarantor, etc.) to be eligible. 

• Validation (§§ 500–505). “Banks must have a robust system in place to validate 
the accuracy and consistency of rating systems, processes and all relevant risk 
components”. Comparisons between realized and estimated LGDs must be 
performed regularly (at least annually) to demonstrate that realized LGDs are 
within the expected range. “Banks must also use other quantitative validation 
tools and comparisons with relevant external data sources”. They must 
demonstrate that methods do not vary systematically with the economic cycle. 
Furthermore, the bank must define reaction standards for the case that deviations 
between realized and estimated LGDs turn out to be significant enough to question 
the validity of the estimates. 
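
The lower limit mentioned under the LGD estimates requirement refers to a default-weighted average, which differs from an average of yearly means whenever defaults cluster in bad years. The following sketch contrasts the two; the function names and the loss history are hypothetical, and applying the floor with a simple maximum is an illustrative assumption, not a prescribed procedure.

```python
def default_weighted_average(lgd_by_year):
    """Pool all defaults: each observed default receives the same weight."""
    all_lgds = [lgd for lgds in lgd_by_year.values() for lgd in lgds]
    return sum(all_lgds) / len(all_lgds)

def time_weighted_average(lgd_by_year):
    """Average the yearly means: each year receives the same weight."""
    yearly_means = [sum(v) / len(v) for v in lgd_by_year.values()]
    return sum(yearly_means) / len(yearly_means)

def floored_estimate(model_lgd, lgd_by_year):
    """Apply the default-weighted average as a lower limit (illustrative)."""
    return max(model_lgd, default_weighted_average(lgd_by_year))

# Hypothetical realised LGDs: the downturn year 2002 has more defaults
# and higher losses than the other years.
history = {
    2001: [0.30, 0.40],
    2002: [0.60, 0.70, 0.65, 0.55],
    2003: [0.35, 0.25],
}
dw = default_weighted_average(history)
tw = time_weighted_average(history)
estimate = floored_estimate(0.45, history)
print(round(dw, 3), round(tw, 3), round(estimate, 3))
```

In this example the default-weighted average exceeds the time-weighted one because the downturn year contributes more observations.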

Fig. 9.1 LGD estimates: data sources and domains of application⁶ 


9.2.2 LGD Estimates in Risk Management and Other Applications 


While Basel II provides the focus of this book, banks may use LGD numbers in 
many applications apart from regulatory reporting. Figure 9.1 depicts some of these 
applications as well as the various connections between them. A bank’s internal 
credit risk reporting and management processes require LGD estimates for different 
purposes: Internal reporting (risk bearing ability, performance measurement, etc.), 
pricing, the bank’s credit approval authority regulations, and limit management 
may be some of these applications. 

Accounting can become another field of application for LGD estimates or 
derivatives of them. When considering IAS/IFRS, LGD figures may enter fair 
value computations and impairment tests. IAS asks banks to disclose fair values 
for financial assets and liabilities at least in the notes of the annual statement.⁷ 
These numbers can, for example, be computed applying a discounted cash flow 
model, with LGD numbers used to adjust cash flows for credit risk.⁸ 

Impairment tests provide further possibilities for connecting accounting and 
credit risk management processes. General provisions can be computed using a 
modified⁹ LGD number based on the finding that the concepts of incurred loss (as 
defined by IAS/IFRS) and expected loss (as used for credit risk measurement) are 
quite similar.¹⁰ Furthermore, best estimate LGDs as required for regulatory purpose 
and specific provisions computed following the rules of IAS/IFRS are both based on 
expectations about future cash flows from a defaulted facility, its collateral, and 
guarantees. Therefore, one may derive both specific provisions and best estimate 
LGDs from the same information base. This will be discussed in more detail in 
Sect. 9.7. 


⁶ NPL is used as an abbreviation for non-performing loan. 

⁷ See IAS 39.8, IASB (2005), for a definition of fair value. 

⁸ As an alternative to cash flow adjustment, one may apply a discount rate adjustment approach. In 
this case, one may refer under certain circumstances to similar risk-adjusted discount rates as used 
for LGD estimation; see Sect. 9.6.2.4. 

⁹ Some of the necessary modifications are addressed below. 

A great part of the functionality required for these three domains of application, 
i.e. regulatory reporting, internal risk reporting and management, as well as 
accounting, is identical. However, there are differences due to diverging intentions: 
stability of the bank in case of Basel II and objective reporting of the bank’s assets in 
case of IAS/IFRS. This may concern the definition of EAD as well as the definition 
of LGD. For example, impairment considers book value as EAD. Fair value 
computations may not take future drawings into account, while these are part of 
Basel II compliant exposure at default. Risk management, on the other hand, may 
recognize future redemption to a larger extent than regulatory requirements allow. 

In addition to the impact of different EAD definitions, the loss definition 
underlying LGD may slightly vary with the domain of application. The level of conservatism 
underlying the estimates will be different due to diverging intentions. Definition of 
loss components can differ; for example, internal costs may not be part of IAS 
numbers, while Basel II and internal applications will recognize them. Furthermore, 
one may decide to consider separate LGDs for different credit events, for example, 
political risks in internal models.¹¹ In addition to the 1-year horizon considered in 
Basel II, a bank may be interested (at least for some applications, possibly including 
regulatory capital) in a dynamic, multi-period projection of risk numbers. Another 
potential field of deviations is the assessment of risk mitigation effects. 

Dealing with different definitions of EAD and LGD can cause some confusion in 
internal communication, despite their different domains of application, and 
therefore requires bridging one EAD or LGD number into the other in order to 
explain the differences. Furthermore, the complexity of an LGD engine, which 
takes all these different requirements into account, can be high, also resulting in 
increased costs of development and maintenance. Before stating bank specific 
additional requirements, one should therefore carefully check whether the expected 
gain in explanatory power justifies the corresponding effort and costs. 


9.3 Definition of Economic Loss and LGD 


Basel II requires measuring economic loss as a basis for LGD estimation. “[...] 
When measuring economic loss, all relevant factors should be taken into account. 
This must include material discount effects and material direct and indirect costs 
associated with collecting on the exposure. [...]” (see BCBS (2004), § 460). The 
directive only mentions basic components of economic loss while leaving the exact 
definition to the banks. 


¹⁰ Due to restricted data availability, differences might be greater in theory than in banking practice. 

¹¹ This will be necessary if a bank defines its PD ratings as local currency ratings. 

One may think of economic loss as the change in a facility’s value due to
default,¹² i.e.

$$\mathrm{EcoLoss}_j(t_{DF}) = V_j(t_{DF}, p) - V_j(t_{DF}, np) \qquad (9.1)$$

with $V_j(t_{DF}, (n)p)$ describing the value of a (non)performing facility $j$ at $t_{DF}$,
the time of default. Following the current discussion, the value of the performing
facility, $V_j(t_{DF}, p)$, is generally approximated by the amount outstanding at default
plus possible further drawings after default, i.e. by EAD.¹³ ¹⁴ The residual value of
the defaulted facility, $V_j(t_{DF}, np)$, can be expressed as the net present value of all
recoveries from the exposure diminished by all direct and indirect costs arising
from default. The LGD of a facility $j$ then follows as the ratio of economic loss to
exposure at default, i.e.


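$$\mathrm{LGD}_j(t_{DF}) = \frac{\mathrm{EAD}_j(t_{DF}) - \mathrm{NPV}(\mathrm{Rec}_j(t), t \ge t_{DF}) + \mathrm{NPV}(\mathrm{Costs}_j(t), t \ge t_{DF})}{\mathrm{EAD}_j(t_{DF})} \qquad (9.2)$$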


with NPV(.) the net present value, and $\mathrm{Rec}_j(t)$ and $\mathrm{Costs}_j(t)$ all recoveries and costs
observed at $t$, respectively. Negative economic loss or LGD indicates a gain. While
negative LGDs are sometimes observed in practice, LGD estimates are generally 
required to be greater than or equal to zero. This article will refer to realisations of 
LGD as ex-post LGDs, while estimates of loss quotas will also be named ex-ante 
LGDs. 
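The computation in (9.2) may be illustrated by a minimal Python sketch. The `CashFlow` container, the function names, the annual compounding convention, and all figures are assumptions made for this example only; they are not prescribed by Basel II or by the text.

```python
from dataclasses import dataclass


@dataclass
class CashFlow:
    years_after_default: float  # t - t_DF, measured in years (assumed convention)
    amount: float               # non-negative amount in currency units


def npv(cash_flows, rate):
    """Discount cash flows back to the time of default (annual compounding assumed)."""
    return sum(cf.amount / (1.0 + rate) ** cf.years_after_default for cf in cash_flows)


def workout_lgd(ead, recoveries, costs, rate):
    """Ex-post LGD following (9.2); negative if recoveries exceed EAD plus costs."""
    if ead <= 0:
        raise ValueError("EAD must be positive")
    return (ead - npv(recoveries, rate) + npv(costs, rate)) / ead


# Hypothetical loss case: EAD of 100, two recoveries, one cost payment.
recoveries = [CashFlow(1.0, 40.0), CashFlow(2.0, 30.0)]
costs = [CashFlow(1.0, 5.0)]

lgd_undiscounted = workout_lgd(100.0, recoveries, costs, rate=0.0)
lgd_discounted = workout_lgd(100.0, recoveries, costs, rate=0.05)
lgd_floored = max(0.0, lgd_discounted)  # floor at zero when used as an estimate
```

In this made-up case the undiscounted loss quota is 35%, while discounting at 5% raises it to roughly 39.5%, which illustrates why the choice of the discount rate matters for long workout processes.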

Recoveries after default result from facility or collateral sale, guarantees, bank- 
rupt’s assets, as well as restructured or cured exposures. Further unexpected sources 
of recoveries may sometimes also be observed. While ex-post LGDs may include 
all types of recoveries received for a defaulted exposure, the reference dataset 
(RDS) for model development should generally not reflect extraordinary recov- 
eries, for example stemming from non-eligible collateral or guarantees, in order to 
avoid distortion.¹⁵

12 Note that differences in default definition will therefore affect economic loss.

13 As an alternative, one might define $V_j(t, p)$ as the net present value of all future recoveries and
costs of the facility in case of no default in t. While theoretically appealing, such a definition can be
difficult to implement in practice. Furthermore, it would also require a respective definition of
EAD as might be done in internal models only.

14 See, for instance, Chaps. 10 and 11 for more details on EAD estimation.

15 Exemptions may be possible if such extraordinary recoveries are observed on a regular basis.

Material direct and indirect costs arising from the handling of a defaulted
exposure are, for example, external and internal labour costs, legal costs, costs for
forced administration, insurance fees, costs for storage, maintenance, repairs of
assets, etc. Furthermore, one should include ongoing costs, for example, corporate
overhead. Refinancing costs resulting from incongruence of cash flows due to
default may also be considered if material.¹⁶ On the other hand, losses of future
earnings (e.g., interest income) are generally not considered as part of economic
loss. With respect to (9.1), one may recognize only additional costs, i.e. the
difference between costs arising from the performing and the defaulted exposure,
respectively. As mentioned above, economic loss and LGD used for IAS purposes
should not include internal costs.

In order to recognize discount effects, all recoveries and costs have to be 
discounted. Since workout processes can be time-consuming, the chosen discount
rate may significantly affect the resulting economic loss and LGD; see Sect. 9.6.2.4. 


9.4 A Short Survey of Different LGD Estimation Methods 


The following provides a short survey of the main approaches for LGD estimation
currently discussed among academia and practitioners. When classifying different 
LGD approaches, a first distinction can be made between subjective and objective 
methods. A bank may have insufficient data to rely solely on quantitative methods. 
This can occur for low default portfolios, new products, or during the introduction 
of LGD methodology. In these situations, the bank may think of subjective methods 
primarily based on expert judgment as a valuable source of information. While 
there seems to be no special literature on subjective methods in LGD estimation, 
techniques known from other fields of application can easily be adopted. Interviews 
with experts from different units of the financial institute, comparisons with similar 
portfolios, or scenario techniques may help to develop an idea of the loss quotas one 
should expect to observe. As far as possible, the bank should incorporate all kinds 
of available loss (related) information into subjective methods. Subjective methods 
may also prove valuable for a validation of the results obtained from applying one 
of the objective methods described next. 

Objective methods can be further classified as being either explicit or implicit, 
depending on the characteristics of the data sources on which they are based. 
Datasets analysed in explicit methods allow for a direct computation of LGDs. 
The so-called market LGD approach, a first explicit method, is applied by compar- 
ing market prices of bonds or marketable loans shortly after default with their par 
values. To compute workout LGDs, it is necessary to discount all recoveries and 
costs observed after default to determine the value of the defaulted facility, which is 
then compared with the defaulted exposure. 
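For the market LGD approach, the comparison of the post-default price with the par value can be written as a one-line computation. The sketch below assumes that the loss quota is taken as one minus the price-to-par ratio; the function name and the figures are hypothetical.

```python
def market_lgd(price_after_default, par_value):
    """Market LGD as one minus the post-default price as a fraction of par."""
    if par_value <= 0:
        raise ValueError("par value must be positive")
    return 1.0 - price_after_default / par_value


# Hypothetical bond trading at 35 per 100 of par shortly after default.
example_market_lgd = market_lgd(price_after_default=35.0, par_value=100.0)
```

A workout LGD, in contrast, would require the full history of discounted recoveries and costs rather than a single price observation.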


16 Incongruence can lead to losses or gains depending on the level of interest rates at the time of
credit granting and default. It is therefore sometimes argued that gains will offset losses due to the 
mean reversion property of the interest rate. 


158 C. Peter 


Different from explicit approaches, implicit methods rely on data sources which 
do not allow for a direct LGD computation but implicitly contain LGD relevant 
information. This information has to be extracted applying appropriate procedures. 
Two approaches which have been discussed in banking practice and in the literature
are the implied market LGD and the implied historical LGD methods.

The idea of the implied market LGD approach is to derive LGD estimates from 
market prices of non-defaulted bonds.¹⁷ The spreads observed for these instruments
at the market express among other things the loss expectation of the market, which 
may be broken down into PD and LGD. While theoretically appealing, it may be 
difficult to separate adequately the credit risk component of the spread and break it 
down into PD and LGD. 

The computation of implied historical LGDs is described in the Basel II frame- 
work as one approach to determine LGDs for retail portfolios (see BCBS (2004), 
§ 465). This approach involves deriving LGDs from realized losses and an estimate 
of default probabilities. 
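A rough sketch of the implied historical approach is given below. It assumes that the realized loss rate of a pool can be read as the product of PD and LGD, so that the LGD follows by division; the function name and the pool figures are hypothetical.

```python
def implied_historical_lgd(realized_losses, exposure, pd_estimate):
    """Back out an LGD from a pool's realized loss rate and a PD estimate."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    if not 0.0 < pd_estimate <= 1.0:
        raise ValueError("PD estimate must lie in (0, 1]")
    loss_rate = realized_losses / exposure  # realized loss per unit of exposure
    return loss_rate / pd_estimate


# Hypothetical retail pool: losses of 12 on an exposure of 1,000 and a PD of 3%.
example_implied_lgd = implied_historical_lgd(12.0, 1000.0, 0.03)
```

The result depends directly on the quality of the PD estimate: an underestimated PD inflates the implied LGD and vice versa.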

Except for implied market LGDs, which may deliver (at least theoretically)
estimates for non-performing facilities directly or with minor modifications, all
other concepts considered so far deliver ex-post LGDs in the first instance. The rest
of this section will consider different approaches for estimating ex-ante LGDs. The
main interest of a bank is generally to derive estimates for workout LGDs, since these
best reflect its losses. Ex-post observations of market LGDs may also be used in
model development; however, doing so may require appropriate adjustments since
market LGDs include components such as risk premiums for unexpected losses, which
may not be considered in workout LGDs. Furthermore, required components like
the institute’s specific workout costs are not part of these loss quotas.

As a first, simple approach, one may consider an ex-ante LGD estimation 
procedure where LGDs are assigned top-down to exposures based on facility 
grades or pool characteristics. Such a procedure requires a segmentation of the 
portfolio under consideration into a small number of, in terms of their loss quota, 
relatively homogeneous groups of facilities. Statistical analysis as well as expert 
judgment provides the basis to identify these segments and to develop the 
necessary assignment rules. Since individual characteristics of facilities can 
only be recognized to a limited extent in such a two-stage approach,¹⁸ one will
expect reasonable performance especially for highly standardized loan programs 
or retail portfolios. 

For portfolios of less standardized facilities, one may expect better performance
from direct or bottom-up estimation approaches. Higher individual credit
volumes and smaller portfolio sizes will often be further arguments justifying the
development and application of more sophisticated estimation procedures. The
basic idea of direct estimation techniques is to estimate LGDs based on a model
which takes individual characteristics of each facility, its collateralization, as
well as other important risk factors explicitly into account. As for PD prediction,
empirical statistical or simulation-based models may be applied.

17 If available, one may also consider market values of loans or credit derivative instruments.
18 See CEBS (2005), § 234.

Simulation approaches are often used for specialized lending transactions where 
the ability of the borrower to fulfil his obligations primarily depends on the cash 
flows generated by the financed object. An individual model of the transaction that 
describes the free cash flows generated by the financed object — and therefore the 
ability to pay interest and principal — as a function of important risk factors provides 
the basis for the simulation. By simulating different scenarios of the transaction’s 
progress, an institute will be able to derive estimates for PD, EAD, and LGD. 
While such approaches provide great flexibility, costs for modelling a specific 
transaction and performing the simulation can be high, depending on the structure 
of the simulation tool. 
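The idea of the simulation approach can be sketched with a deliberately simplified one-period Monte Carlo model. Everything in this sketch is an assumption made for illustration: the single lognormal risk factor, the default trigger (cash flow below debt service), the proportional cost rate, and all parameter values. A realistic transaction model would be multi-period and far more detailed.

```python
import math
import random


def simulate_transaction(n_paths, expected_cash_flow, debt_service, volatility,
                         cost_rate, seed=0):
    """Toy one-period simulation; returns (default frequency, mean LGD given default)."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    loss_quotas = []
    for _ in range(n_paths):
        shock = rng.gauss(0.0, 1.0)  # single standard normal risk factor
        # Lognormal cash flow whose mean equals expected_cash_flow (assumption).
        growth = math.exp(-0.5 * volatility ** 2 + volatility * shock)
        cash_flow = expected_cash_flow * growth
        if cash_flow < debt_service:  # assumed default trigger
            recovery = cash_flow * (1.0 - cost_rate)  # cash flow net of workout costs
            loss_quotas.append((debt_service - recovery) / debt_service)
    default_frequency = len(loss_quotas) / n_paths
    mean_lgd = sum(loss_quotas) / len(loss_quotas) if loss_quotas else float("nan")
    return default_frequency, mean_lgd


default_frequency, mean_lgd = simulate_transaction(
    n_paths=20_000, expected_cash_flow=120.0, debt_service=100.0,
    volatility=0.25, cost_rate=0.05,
)
```

Because default and loss are driven by the same simulated cash flow, such a model produces PD and LGD estimates that are consistent with each other by construction.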

LGD estimates based on empirical statistical models can be generated by 
applying a single equation or a component-based approach. While the first approach 
intends to describe LGD by a single (for example regression) model, the latter one 
consists of a set of submodels each describing a certain component of LGD, e.g. the 
recovery rate for a certain collateral type or costs of certain workout activities. LGD 
estimates are then generated by appropriately aggregating the results of the esti- 
mates for these components. Statistical models for LGD or single LGD components 
can also be used in simulations. 

Banks may apply different techniques depending on the characteristics of the 
respective portfolio segment, its importance with respect to the whole portfolio, 
and the availability of loss data. This allows on the one hand measuring LGDs for 
different products with customized estimation procedures. On the other hand, how- 
ever, it can make a consistent measurement of credit risk over the whole portfolio 
more difficult. 


9.5 A Model for Workout LGD 


Consider the situation that a bank faces after a borrower’s default. While default 
itself marks a unique reference point for loss measurement, the workout of a 
defaulted credit facility as well as the resulting loss can vary substantially. How- 
ever, one will probably observe a certain pattern of typical developments, called 
after-default scenarios in this paper. Table 9.1 provides a reasonable set of such 
scenarios. Depending on the bank’s portfolio as well as its workout strategy, the
number and definition of after-default scenarios may slightly differ. 

Table 9.1 A set of possible after-default scenarios

Scenario $sc_i$   Definition and explanationᵃ

Cure              The defaulted entity cures after a short time and continues to fulfil its
                  contractual obligations.
                  No significant losses; no changes in the structure or conditions of the
                  credit facilities.

Restructuring     The defaulted entity recovers after a restructuring of its facilities.
                  Repossession and sale of collateral may sometimes be part of the
                  restructuring.
                  Loss amount may vary; customer relationship maintained.

Liquidation       All credit products of the defaulted entity are liquidated, i.e. sale of
                  loans, collateral (if available), etc.
                  Loss amount generally higher than observed for restructuring; end of
                  customer relationship.

ᵃ Scenarios will generally be defined with respect to the defaulted entity (i.e. for borrower or
guarantors) and may therefore not always correspond with what is observed for a single credit
product.

While the loss observed within a certain scenario may be similar for different
(comparable) facilities, it will generally be impossible to know the after-default
scenario in advance. One may therefore consider the loss quota of a facility $j$,
$\mathrm{LGD}_j$, as a random variable following a mixture distribution. With $SC_j$ a
discrete-valued random variable describing the occurrence of after-default scenarios,
$\mathrm{LGD}_j(sc_i)$ a second, continuous-valued random variable describing the loss of
a facility depending on the scenario $sc_i$, and $\delta(.)$ the indicator function,¹⁹
$\mathrm{LGD}_j$ can be defined as²⁰

$$\mathrm{LGD}_j = \sum_{sc_i} \delta_{sc_i}(SC_j) \cdot \mathrm{LGD}_j(sc_i) \qquad (9.3)$$


Collateral and guarantees will generally have a strong impact on the loss quota
realized for a defaulted facility. Consider a facility which is secured by $n \ge 1$ risk
mitigation instruments.²¹ Each of these instruments $k$ collateralizes $sq_{j,k}$ percent of
the exposure. One can now break down the exposure into $m \le n$ subexposures, each
collateralized by at least one instrument, and an additional part, $sq_{j,0}$, which remains
unsecured. The percentage of loss realized on each subexposure $sq_{j,l}$, $0 \le l \le m$,
may depend on the respective risk mitigation instrument as well as the after-default
scenario currently under consideration. The total loss quota in scenario $sc_i$
is therefore given by

$$\mathrm{LGD}_j(sc_i) = \sum_{0 \le l \le m} sq_{j,l} \cdot \mathrm{LGD}_{j,l}(sc_i) \qquad (9.4)$$

where $\mathrm{LGD}_{j,l}(sc_i)$ describes the percentage of loss observed on an (un)secured
subexposure of size $sq_{j,l}$. Since the breakdown in (9.4) is equivalently performed
for each of the after-default scenarios, one may alternatively write


19 I.e. $\delta_{sc}(SC) = 1$ for $SC = sc$ and $\delta_{sc}(SC) = 0$ otherwise.

20 To simplify the presentation, time references are left out in (9.3) as well as in most of the
formulas following. It is generally assumed in this article that one intends to predict the loss quota
for a default occurring within a time interval $T = [t_s, t_e)$ given the information up to $t_0$ (the time
where the computation takes place), i.e. $\mathrm{LGD}_j = \mathrm{LGD}_j(T \mid t_0)$.

21 This article uses the expression “risk mitigation instrument” (rmi) as a general notion for all kinds
of collateral and guarantees.

$$\mathrm{LGD}_j = \sum_{0 \le l \le m} sq_{j,l} \cdot \mathrm{LGD}_{j,l} \qquad (9.5)$$

with

$$\mathrm{LGD}_{j,l} = \sum_{sc_i} \delta_{sc_i}(SC_j) \cdot \mathrm{LGD}_{j,l}(sc_i) \qquad (9.6)$$


With respect to (9.2), $\mathrm{LGD}_{j,l}(sc_i)$ can be expressed as

$$\mathrm{LGD}_{j,l}(sc_i) = \max\{0;\ 1 - RR_{j,l}(sc_i) + \mathrm{Costs}_{j,l}(sc_i)\}, \qquad (9.7)$$

with $RR_{j,l}(sc_i)$ and $\mathrm{Costs}_{j,l}(sc_i)$ the percentage of recovery and costs on exposure
$sq_{j,l}$ of facility $j$ in scenario $sc_i$.
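The structure of (9.3) to (9.7) can be made concrete with a small numerical sketch. It takes the expectation of the mixture, i.e. the indicator is replaced by an assumed scenario probability and the scenario-dependent recovery and cost rates are treated as point estimates. The scenario probabilities, shares, and rates below are hypothetical, as is the requirement that the subexposure shares sum to one.

```python
def subexposure_lgd(recovery_rate, cost_rate):
    """Equation (9.7): loss quota on one subexposure, floored at zero."""
    return max(0.0, 1.0 - recovery_rate + cost_rate)


def scenario_lgd(shares, recovery_rates, cost_rates):
    """Equation (9.4): share-weighted loss quota within one after-default scenario."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("subexposure shares must sum to one")
    return sum(
        sq * subexposure_lgd(rr, cost)
        for sq, rr, cost in zip(shares, recovery_rates, cost_rates)
    )


def expected_lgd(shares, scenarios):
    """Probability-weighted scenario LGDs, i.e. the expectation of (9.3)."""
    if abs(sum(s["prob"] for s in scenarios.values()) - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to one")
    return sum(
        s["prob"] * scenario_lgd(shares, s["recovery"], s["cost"])
        for s in scenarios.values()
    )


# Index 0: unsecured part; index 1: part secured by a single instrument.
shares = [0.4, 0.6]
scenarios = {
    "cure":          {"prob": 0.3, "recovery": [1.0, 1.0], "cost": [0.01, 0.01]},
    "restructuring": {"prob": 0.3, "recovery": [0.7, 0.9], "cost": [0.05, 0.05]},
    "liquidation":   {"prob": 0.4, "recovery": [0.2, 0.8], "cost": [0.10, 0.10]},
}
lgd_by_scenario = {
    name: scenario_lgd(shares, s["recovery"], s["cost"]) for name, s in scenarios.items()
}
lgd_expected = expected_lgd(shares, scenarios)
```

With these made-up inputs the scenario loss quotas are 1%, 23%, and 54%, and the probability-weighted loss quota is 28.8%.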

Equation (9.5) follows the structure of the formula provided in BCBS (2004) for 
risk mitigation. The extension of considering after-default scenarios may prove 
helpful as a theoretical model as well as for analysing the characteristics of 
observed economic loss or model development. The relatively simple structure of 
the model, which demonstrates the main idea while hiding most of the complexity 
of the underlying statistical models, will also be easy to communicate within the 
bank. This may increase acceptance of the estimation procedures, which may 
appear as a black box for credit analysts. Ex-ante estimates, however, are often 
generated based on a reduced form of the model presented here. 


9.6 Direct Estimation Approaches for LGD 


The following considers direct estimation approaches for LGD. Setting up such a 
procedure requires a description of the components of economic loss, i.e. recoveries 
on secured and unsecured exposures as well as costs, in terms of appropriate 
explanatory variables with respect to the requirements imposed by different 
domains of application, i.e. Basel II, IAS, or internal risk management, respec- 
tively. The development process for an LGD estimation procedure can generally be 
structured along the following steps: 


1. Data collection, pre-processing and analysis 
2. Model design and estimation 
3. Model validation 


Some steps of the development process may have to be repeated several times
before a satisfactory solution is found. Figure 9.2 depicts a typical series of projects a
bank may set up in order to develop an LGD engine. The implementation of a credit
loss database is often the first step. It creates the basis for a systematic collection
of loss data required for model development. The respective project generally
incorporates (or is followed by) activities to transfer (a part of) the bank’s loss history
from paper files to the database. The LGD estimation model as well as the required
validation procedures and processes can be developed afterwards.

Fig. 9.2 Typical structure of an initial phase to set up an LGD engine and the following annual
validation process

Following the initial project phase, the LGD engine will be subject to regular 
enhancement and maintenance activities. Such activities may be triggered, for 
example, through the introduction of new products or regulatory changes as well 
as the results of the annual validation. 

The following concentrates on the first two steps of the development process for 
an LGD engine. The presentation starts with a short discussion of some aspects of 
data collection, mainly through a description of typical elements of a credit loss 
database. Afterwards, different aspects of model development are discussed. 


9.6.1 Collecting Loss Data: The Credit Loss Database 


One will generally consider the bank’s own past loss experience as the most 
valuable information available for the development of an LGD estimation proce- 
dure, since it directly reflects the characteristics of the institute’s credit products and 
processes (e.g., origination, monitoring, and workout processes). Banks therefore 
often set up a credit loss database in order to collect all relevant information 
concerning defaulted entities and their credit exposures. 

The aggregate of all information concerning a defaulted entity and its exposures 
is often called a loss file. A loss file will generally include not only information 
about the time after the occurrence of a default but also information about the time 
before. Information about the time after default occurrence consists of

• Possible further drawings after default

• All recoveries related to the defaulted entity, its credit facilities, and risk
mitigation instruments

• All costs arising from the workout process

• Additional information about the workout process (for example, events and
remarks as well as identifiers of restructured facilities and repossessed assets,
which later allow these objects to be identified within the bank’s IT-systems)


Further information collected within the credit loss database includes cash flows 
before default (or exposure at the time of default), master data, rating history, 
collateral values, etc. The later model development and estimation process gener- 
ally requires additional information, for example, time series of macroeconomic 
variables or version numbers of the applied risk measurement tools (ratings tools, 
collateral valuation tools, etc.), which may also be incorporated in the database. 

It will often take some time to realize all cash flows from cured or restructured 
credit facilities as well as from repossessed assets. Since workout usually ends 
much earlier and credit products or assets are then transferred from the workout unit 
to another unit within the bank or an external service provider, loss files will often 
be closed by the end of the respective workout activities. Cured or restructured 
credit facilities as well as repossessed assets are valued by that time and the result 
stored as a non-cash recovery in the loss file.²²

Since the number of loss observations is often small and loss data coming from 
the latest defaults also contains the most up-to-date information about current loss 
quotas, it appears attractive to include incomplete loss files as early as possible 
in the reference dataset for model development. The decision as to whether an 
incomplete loss file should be incorporated in the reference dataset will generally be 
made on a case-by-case basis and can also depend on the application. A reasonable 
decision criterion may often be defined based on the uncertainty still inherent in the 
value of economic loss due to the incompleteness of the loss case. Often, the end of 
the workout process is a reasonable time to include a loss file into the reference 
dataset. A component-based estimation approach may provide possibilities for even 
earlier usage of incomplete loss data; for example, by considering incomplete 
loss files in the reference dataset of some LGD components only.²³ While the use
of incomplete loss files will make loss data available more quickly, this data, still 
incorporating estimates, can only be used to a limited extent, which may limit the 
benefit. 
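The content of a loss file as described above might be represented along the following lines. This is a sketch only: the class and field names, the cash flow categories, and the completeness rule (file closed and no non-cash recoveries left) are assumptions for illustration and do not describe an actual database schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class LossCashFlow:
    booking_date: date
    amount: float
    kind: str                     # "drawing", "recovery" or "cost" (assumed categories)
    source: Optional[str] = None  # e.g. collateral sale or guarantee; free text here
    is_cash: bool = True          # False for valued, not yet realized recoveries


@dataclass
class LossFile:
    entity_id: str
    default_date: date
    exposure_at_default: float
    cash_flows: List[LossCashFlow] = field(default_factory=list)
    events: List[str] = field(default_factory=list)   # structured workout milestones
    remarks: List[str] = field(default_factory=list)  # unstructured notes
    closed_on: Optional[date] = None                  # None while workout is ongoing

    def total(self, kind):
        """Undiscounted sum of all cash flows of the given kind."""
        return sum(cf.amount for cf in self.cash_flows if cf.kind == kind)

    def is_complete(self):
        """Assumed rule: closed and containing realized (cash) flows only."""
        return self.closed_on is not None and all(cf.is_cash for cf in self.cash_flows)


loss_file = LossFile("E-001", date(2009, 3, 31), exposure_at_default=100.0)
loss_file.cash_flows += [
    LossCashFlow(date(2009, 9, 30), 40.0, "recovery", "collateral sale"),
    LossCashFlow(date(2009, 10, 15), 5.0, "cost", "legal"),
    LossCashFlow(date(2010, 1, 31), 25.0, "recovery", "restructured facility",
                 is_cash=False),
]
loss_file.closed_on = date(2010, 1, 31)
```

In this example the file is closed but still contains a non-cash recovery, so `is_complete()` returns `False`; whether such a file enters the reference dataset remains a case-by-case decision.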


22 However, as mentioned above, one should include references in the loss file in order to allow
for a later replacement of non-cash recoveries by the corresponding cash recoveries realized from 
the respective cured or restructured facilities. Note that non-cash recoveries are generally esti- 
mates of future, uncertain cash flows. 

23 For example, repossession and sale of collateral might already be finished for a defaulted credit
product. The respective information can then be used to update the estimate of the recovery rate for 
the respective collateral type(s) while at the same time the information required to re-estimate the 
recovery rate for unsecured exposure might still be incomplete. 




A number of further aspects should be considered during data collection and
pre-processing; the following outlines a few of them:

• Most of the requisite data can generally be found in existing IT-systems, allowing
for the automatic collection of loss data. However, manual inputs are
probably necessary during the workout process. These will include most
information about the workout process, i.e. events, remarks, etc. While remarks allow
entering information in an unstructured way, events provide the possibility of
marking specified states and decisions, milestones, or turning points in order to
structure the workout process for later analysis.²⁴ The extent to which such data
must be added should be specified carefully in order to get informative loss files
without causing too much extra work and costs.

• Since estimation procedures will improve over time, it will often be beneficial to
collect a superset of the loss data currently required for model estimation. The
degree of detail may be different depending on the business line or credit
product. This may, for example, result in more detailed loss data collection for
large corporate than for retail exposures.

• Assuring the quality of loss data can be more time-consuming than expected at
first glance. Simple automatic consistency checks might help to detect
irregularities in the data; however, a larger part of the checks requires a deeper
understanding of the workout processes as well as the loss cases themselves and
therefore has to be done in collaboration with experts from restructuring and
workout units.
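Simple automatic consistency checks of the kind mentioned in the last item might look as follows. The record layout (a plain dictionary) and the three rules are assumptions chosen for illustration; a real check catalogue would be agreed with the workout experts.

```python
from datetime import date


def check_loss_record(record):
    """Return a list of findings; an empty list means that no rule fired."""
    findings = []
    if record["exposure_at_default"] <= 0:
        findings.append("EAD is not positive")
    closed_on = record.get("closed_on")
    if closed_on is not None and closed_on < record["default_date"]:
        findings.append("loss file closed before default date")
    for i, cf in enumerate(record["cash_flows"]):
        if cf["amount"] < 0:
            findings.append(f"cash flow {i}: negative amount")
        if cf["kind"] in ("recovery", "cost") and cf["date"] < record["default_date"]:
            findings.append(f"cash flow {i}: {cf['kind']} booked before default")
    return findings


# Hypothetical record containing two irregularities.
record = {
    "exposure_at_default": 100.0,
    "default_date": date(2009, 3, 31),
    "closed_on": date(2010, 1, 31),
    "cash_flows": [
        {"kind": "recovery", "date": date(2009, 1, 15), "amount": 10.0},
        {"kind": "cost", "date": date(2009, 6, 30), "amount": -2.0},
    ],
}
findings = check_loss_record(record)
```

Such rules only flag candidates for review; deciding whether a flagged booking is actually wrong still requires knowledge of the individual loss case.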


9.6.2 Model Design and Estimation 


The general structure of an LGD estimation procedure often consists of the follow- 
ing three steps: 


1. Data collection. Identification and collection of all data required to estimate
LGD.

2. Pre-processing. Transformation of raw data into a form suitable for the
estimation of LGD or LGD-related numbers. This may already include estimates for
single LGD components.

3. Generating estimates. Generation of LGD estimates by appropriately assembling
the results of pre-processing. In particular, this includes recognizing the
risk mitigation effect of guarantees and collateral. As a by-product, the
procedure may also provide other useful information, for example statistics on the
concentration in risk mitigation instruments.


Figure 9.3 shows the basic structure of an LGD engine as it might be implemented
within a bank’s IT-systems. Depending on the IT-infrastructure, institutes
may run various engines for different applications or portfolio segments or refer to a
central engine as depicted here. In the latter case, a controller may organize the
computation of LGD estimates for different applications.

24 An example of how this information may be used in LGD estimation is given in Sect. 9.6.2.3.

Fig. 9.3 Structure of an LGD engine

Regression-type models are generally preferred as a flexible approach for mod- 
elling LGD or its components. Such approaches have been considered in several 
publications on LGD estimation; see for example Altman et al. (2003) or Chap. 8. 
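As a very small illustration of a regression-type model, the sketch below fits a straight line by ordinary least squares to a single explanatory variable and clips the prediction to the interval [0, 1]. The choice of the loan-to-value ratio as regressor, the five observations, and the clipping rule are assumptions for this example; the observations are invented and carry no empirical meaning.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_lgd(intercept, slope, x):
    """Linear prediction, clipped so that the estimate stays within [0, 1]."""
    return min(1.0, max(0.0, intercept + slope * x))


# Invented observations: loan-to-value ratio at default and realized LGD.
ltv = [0.5, 0.7, 0.9, 1.1, 1.3]
realized_lgd = [0.10, 0.18, 0.31, 0.39, 0.52]

intercept, slope = fit_line(ltv, realized_lgd)
lgd_at_ltv_100 = predict_lgd(intercept, slope, 1.0)
lgd_at_ltv_20 = predict_lgd(intercept, slope, 0.2)  # linear fit is negative, clipped
```

The need for clipping hints at a known weakness of a plain linear specification for a bounded quantity; a transformation of the dependent variable is one common alternative.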

Banks will often “suffer”, at least during the initial years after introducing LGD
estimation procedures, from an insufficient number of loss observations at least for
certain parts of their portfolio. The need to rely on information from various
sources, sometimes following different definitions of default and loss, and also
having different quality characteristics, can make other, more “simple”²⁵ approaches
attractive. Capacity as well as time restrictions or priority settings among different
portfolio segments are additional reasons why banks may start with these
approaches for some portfolios.

25 While being simple from a purely statistical point of view, setting up a procedure that generates
reasonable LGD predictions based on different types of information will nevertheless often remain
a demanding task.

Lookup-table based approaches will often provide the basis for LGD estimation
procedures in such situations. The idea here is to tabulate possible values of some
variable of the model, for example, a recovery or cost rate, or the resulting LGD
numbers themselves, together with the respective selection criteria. For instance, a
bank may tabulate recovery rates for unsecured exposures depending on customer
type, facility type, seniority, and region (see also Table 9.2). Given such a table, the
bank can easily generate an estimate for the recovery rate of some exposure by
reading the recovery value corresponding to these four characteristics. The
development of such a table requires first the identification and description of segments
of similar values for the considered variable in terms of appropriate explanatory
variables. Afterwards, a representative value for the variable under consideration
has to be estimated for each segment. Both steps can be supported by expert
judgment or other external information sources if the bank’s reference dataset is
insufficient.
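A lookup table keyed by the four characteristics mentioned above can be sketched as a simple dictionary. The segment keys, the recovery values, and the conservative fallback for unmapped segments are hypothetical and serve only to show the mechanics.

```python
# Hypothetical recovery rates for unsecured exposures, keyed by
# (customer type, facility type, seniority, region).
RECOVERY_TABLE = {
    ("corporate", "loan", "senior", "domestic"): 0.45,
    ("corporate", "loan", "junior", "domestic"): 0.25,
    ("SME", "loan", "senior", "domestic"): 0.40,
}
FALLBACK_RECOVERY = 0.20  # assumed conservative value for segments not in the table


def lookup_recovery_rate(customer_type, facility_type, seniority, region):
    """Read the tabulated recovery rate; fall back if the segment is unknown."""
    key = (customer_type, facility_type, seniority, region)
    return RECOVERY_TABLE.get(key, FALLBACK_RECOVERY)


def unsecured_lgd(customer_type, facility_type, seniority, region, cost_rate):
    """Combine the tabulated recovery rate with a cost rate, floored at zero."""
    recovery = lookup_recovery_rate(customer_type, facility_type, seniority, region)
    return max(0.0, 1.0 - recovery + cost_rate)


senior_recovery = lookup_recovery_rate("corporate", "loan", "senior", "domestic")
unknown_recovery = lookup_recovery_rate("sovereign", "bond", "senior", "foreign")
senior_lgd = unsecured_lgd("corporate", "loan", "senior", "domestic", cost_rate=0.05)
```

Keeping the table as data rather than as code makes it straightforward to replace single segments with the output of a statistical model at a later stage.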

One should expect such models to capture only a part of the (explicable) vari- 
ability of LGD numbers observed in practice. For example, it will be difficult to 
describe the dynamics with respect to changes in macroeconomic variables. This 
can result in higher margins of conservatism and therefore rather conservative LGD 
estimates. On the other hand, lookup-table based approaches are more intuitively 
understandable, thus supporting internal communication and acceptance within the 
bank, which can be advantageous especially during the introduction phase. They 
may therefore serve as a starting point for some portfolio segments when introdu- 
cing an LGD estimation procedure. It is then a matter of further developments to 
successively replace lookup-tables with more sophisticated statistical models wher- 
ever sufficient loss data can be made available and one expects significant improve- 
ments in the quality of LGD estimates. However, designing an LGD engine in a 
way that easily supports the migration from a simple to a more sophisticated 
estimation procedure at a later point in time can be complicated and may lead to 
increased follow-up costs. 

The following sections consider some aspects of the design of an LGD estima- 
tion procedure. The first section considers basic explanatory variables for LGD. 
Afterwards, approaches to estimate the two main components of LGD, recoveries 
and costs, are described. The choice of appropriate discount rates is considered 
next. A last section concludes this part with a short discussion on how the Basel II 
requirements concerning the conservatism of LGD estimates can be recognized. 
It is beyond the scope of this article to describe the whole development procedure in 
detail; the following will therefore skip many technical details which may be found 
in most statistics textbooks.


9.6.2.1 Possible Explanatory Variables for LGD Estimation 


To identify appropriate explanatory variables, also named risk factors or risk 
drivers, one may start with a list of possible risk factors resulting from expert 
judgment, which are then tested during model development for their individual and 
joint explanatory power. In practice, the limited number of loss observations will 
sometimes make a statistical analysis difficult or even impossible, and may there- 
fore restrict the set of risk drivers that can be considered in an LGD model. 

Table 9.2 summarizes some possible explanatory variables generally considered
as possible risk drivers when developing LGD estimation procedures. Most of them
can easily be justified by intuition.²⁶ Furthermore, one will expect some of these
variables to have explanatory power not only for (single components of) LGD
but also for PD and EAD, indicating dependences between these key parameters of
credit risk.

Table 9.2 Examples of possible explanatory variables grouped by categories

Category                  Explanatory variables

Borrower                  Customer type (sovereign, private entity, SME, corporate, ...),
                          country or region, industry, legal structure and capital structure
                          of the entity, rating, etc.

Credit facility           Seniority (senior, junior, ...), debt type (loan, bond, ...),
                          transaction type (syndicated loan, ...) and number of financing
                          entities, exposure, financing purpose, degree of standardization,
                          collateralization (LTV, ...), etc.

Collateral                Type, current book or market value, value depreciation, age,
                          mobility (immobile, national or international mobile),
                          producer, technical characteristics (for example, engine type of
                          an airplane or gauge of a locomotive), etc.

Guarantee                 Guarantor (see list of explanatory variables required for borrowers
                          as provided above), coverage, warranty clauses, etc.

Macroeconomic and other   GDP growth rate, unemployment rates, interest rates, FX rates,
external factors          price indices, legal system and institutions, etc.

Bank internal factors     Versions of valuation procedures and tools, workout strategy,
                          collateralization strategy, etc.

Borrower and credit product characteristics such as industry, capital
structure, and seniority may explain recovery rates on unsecured exposures in
liquidation scenarios (i.e. from bankrupt’s assets). They may also indicate workout
intensity as a proxy for workout costs. Depending on the regional distribution of
the portfolio, it could be necessary to consider region or country as explanatory
variables.²⁷

Recoveries from collateral will depend on the possibility of repossessing and 
selling the respective assets. Depending on the market size and structure observed 
for a certain asset class, the bank may have to accept discounts for distressed sale. 
Technical characteristics of the respective assets could serve as an indicator for the 
level of such discounts and may also explain in part the costs of sale. Analogously, 
the value of a guarantee depends on the credit standing of the respective guarantor 
as well as on specific warranty clauses. In case the guarantor defaults, recoveries 
can be expected to depend to a large degree on the same explanatory variables as 
mentioned above for unsecured exposures (i.e. borrower characteristics). 


26 A comprehensive survey of empirical analyses can be found in Bennett et al. (2005); the
following mentions only a few of them.

27 Altman and Kishore (1996) and Acharya et al. (2004) found significant differences in recoveries
of defaulted bonds belonging to different seniority classes. The same authors report significant
differences for only some industry sectors, while Araten et al. (2004) could not find significant
impact of industry (or region) on LGDs observed for loans.

The macroeconomic situation at default will generally influence LGD, as was
demonstrated by several authors.²⁸ Basel II explicitly asks to take economic cycles
into consideration. Depending on the regional distribution of the institution’s
portfolio and the considered recovery source (e.g., a certain asset type), one may
consider different economic variables. Since default and recoveries from bankrupt’s
assets and collateral may both depend on the same macroeconomic variables,
an appropriate recognition of these dependences will be important to avoid
overestimating recoveries.²⁹ Other external factors such as jurisdiction and legal system can
also play a role when explaining lengths and costs of workout activities as well as
amount of recoveries.³⁰

As a last group of explanatory variables for LGDs, one should consider bank 
internal characteristics. Loss experience as well as LGD estimates will reflect to 
a certain degree characteristics of the bank’s internal processes (e.g., origination, 
monitoring, and workout processes). For instance, a bank’s workout strategy has 
a strong impact on the magnitude of recoveries and costs. Therefore, any change in 
the strategy may require modifications in the LGD estimation procedures in order to 
recalibrate them. For example, a modification of a collateral valuation procedure
may require a transformation of historical valuations and adjustments in
estimated recovery rates for the respective asset type as well as modifications of
the LGD estimation procedure.³¹


9.6.2.2 Estimating Recoveries 


Recoveries are generally the main driver of LGD. With respect to (9.3)-(9.7), one 
may define recovery rates as 


$$RR_{j,l}(sc_i) = \frac{NPV(CF_{j,l}(sc_i))}{sq_{j,l} \cdot EAD_j} \qquad (9.8)$$

for (un)secured exposures of size $sq_{j,l} \cdot EAD_j$ observed for a loss case³² $j$ in the
respective after-default scenario $sc_i$. $sq_{j,l}$ may be defined in different ways as will be


28 Araten et al. (2004) report correlation of unsecured exposures (but not of secured exposures)
with economic cycle. Several authors report dependences found in bond data, see for example
Hamilton et al. (2006) or Altman et al. (2003).

29 Several authors have analysed the link between default and LGD; see for example Frye (2000a, b),
Altman et al. (2003), and Düllmann and Trapp (2004).

30 See for example Franks et al. (2004) for an analysis of recovery processes and rates in the U.K.,
France, and Germany. Useful information about doing business in different countries may also be
found at http://www.doingbusiness.org.

31 See Sect. 9.6.2.2 for more details. The example also demonstrates why the version number of a
collateral valuation tool may be important information within the credit loss database; see
Sect. 9.6.1.

32 A loss case will generally comprise all credit products of a defaulted entity.

Table 9.3 Recovery sources with respect to different after-default scenarios

Scenario sc_i             Unsecured exposure              Secured exposure

Cure                      Recovery = cured facility [A] (total exposure)

Restructuring
  (I) no usage of rmi     Recovery = restructured facility [B] (total exposure)
  (II) usage of rmi       Recovery = restructured         Recovery from eligible
                          facility [C]                    collateral or guarantee [D]

Liquidation               Recovery from                   Recovery from eligible
                          bankrupt’s assets [E]           collateral or guarantee [F]

discussed below. NPV(CF) again denotes the net present value of all cash flows 
which are observed on the respective exposure. Assume for this section that 
recovery rates are determined without taking costs into consideration. 

Equation (9.8) can be used to generate lookup-tables based on historical loss
information or may also be computed as part of the estimation procedure. The latter
case may be attractive if the bank plans to consider the discount rate as an input
parameter of the estimation procedure.³³

Table 9.3 summarizes recovery sources for the after-default scenarios shown in 
Table 9.1. Basel II requires assigning LGDs to each facility (see Sect. 9.2.1); 
however, in practice, recoveries can be observed on different, often more aggre- 
gated levels. These are generally credit entities (i.e. borrower and guarantors), 
facilities, and risk mitigation instruments.³⁴ For example, two loans of a defaulted
obligor may be collateralized by the same asset. In this case, the distribution of the 
asset’s sales proceeds onto the loans is often ambiguous. Ex-post LGD computation 
as well as ex-ante estimation on a loan level therefore require appropriate proce- 
dures to allocate recoveries to facilities. 

Since guarantees require, at least under Basel II, a slightly different treatment, 
the following considers first exposures, which are either unsecured or secured by 
collateral. Afterwards, the risk mitigation effect of guarantees is considered in 
a separate subsection. A concluding third section outlines additional aspects of 
recovery rate estimation. 


Unsecured Exposures and Exposures Secured by Collateral 
Consider an exposure collateralized by an asset $A_i$ having a reference value $V_t(A_i)$ at
time $t$. For ex-ante estimation, the reference value will generally be the result of
the most recent valuation of the asset. Ex-post, one may use either the last valuation
before default or, if available, a valuation performed after default.³⁵ Given this
information, the recovery rate $RR_k$ for a certain collateral type $k$ can be estimated
ex-post as the ratio of the net present values NPV(CF) of all sold assets of type $k$ to
the respective collateral valuations $V(A)$ before default.

33 Details of this approach are considered in the next section for collateral recoveries.

34 The same holds true for other components of LGD, see for example Sect. 9.6.2.3.

Table 9.4 Two approaches for estimating recovery rates of (un)secured exposures

                      Approach 1                        Approach 2

Secured exposure      Estimate RR_k of asset type       Estimate RR_k of asset type k based on
                      k based on recoveries             recoveries of [D] and [F]
                      of [A], [B], [D], and [F]

Unsecured exposure    Estimate RR_0 based on            Estimate RR_0 based on recoveries of [C]
                      recoveries of [A], [B],           and [E]
                      [C], and [E]

Total exposure        -                                 Estimate the recovery rate based on
                                                        recoveries of [A] and [B]

                                                        Estimate the probabilities of
                                                        after-default scenarios without or with
                                                        usage of risk mitigation instruments (a)

(a) All references with respect to Table 9.3

Now assume that asset $A_i$ is of collateral type $k(i)$ and that loss experience
indicates a recovery rate $RR_{k(i)}$ for this collateral type or for an exposure
collateralized by it, respectively. The bank would then expect to realise a recovery of
$V_t(A_i) \cdot RR_{k(i)}$ for the respective reference object, i.e. for a secured exposure or the
asset itself. The reference size in (9.8) is given by $sq_{j,l} = \min\{1; V_t(A_i)/EAD_j\}$.
Alternatively, one may define the respective reference size in (9.8) with respect
to the recovery of asset $A_i$, i.e. $sq_{j,l} = \min\{1; V_t(A_i) \cdot RR_{k(i)}/EAD_j\}$. The
recovery on subexposure $sq_{j,l} \cdot EAD_j$ will then be 100%. In both cases, one may
proceed similarly for unsecured exposures, considering the respective exposure size
$sq_{j,0} = \max\{0; 1 - \sum_{l \ge 1} sq_{j,l}\}$ as the “asset” value.
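
The ex-post recovery rate of a collateral type and the two alternative reference sizes can be illustrated with a minimal Python sketch. The function names and all figures are hypothetical; in particular, taking the ratio of summed NPVs to summed valuations is only one possible reading of the ratio described in the text.

```python
def ex_post_recovery_rate(npv_recoveries, valuations):
    """Recovery rate of a collateral type: summed NPVs of sale proceeds
    divided by summed valuations before default (assumed aggregation)."""
    total_value = sum(valuations)
    if total_value <= 0:
        raise ValueError("valuations must sum to a positive amount")
    return sum(npv_recoveries) / total_value


def secured_share(ead, value, recovery_rate=None):
    """Reference size sq of the exposure part secured by one asset.

    recovery_rate=None: share refers to the asset value V (first definition).
    Otherwise: share refers to the expected recovery V * RR (second definition).
    """
    reference = value if recovery_rate is None else value * recovery_rate
    return min(1.0, reference / ead)


def unsecured_share(shares):
    """Remaining unsecured share of the exposure, floored at zero."""
    return max(0.0, 1.0 - sum(shares))


# Illustrative numbers only.
rr_k = ex_post_recovery_rate(npv_recoveries=[40.0, 20.0], valuations=[80.0, 40.0])
ead, value = 100.0, 60.0

sq_value_based = secured_share(ead, value)           # first definition
sq_recovery_based = secured_share(ead, value, rr_k)  # second definition

# First definition: recovery rate RR_k applies to the secured part.
recovery_1 = sq_value_based * ead * rr_k
# Second definition: the (smaller) secured part is recovered at 100%.
recovery_2 = sq_recovery_based * ead * 1.0

print(rr_k, sq_value_based, sq_recovery_based, recovery_1, recovery_2)
print(unsecured_share([sq_value_based]), unsecured_share([sq_recovery_based]))
```

In this example both definitions lead to the same expected collateral recovery; they differ in how much of the exposure is treated as unsecured, which matters once a separate recovery rate is applied to the unsecured part.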

The general model described in (9.3)-(9.7) defines LGD as the weighted sum of
LGDs observed in different after-default scenarios on a set of subexposures.
Estimation of ex-ante LGDs may follow this line, i.e. first estimate $LGD_{j,l}(sc_i)$ for each
subexposure in each scenario and afterwards aggregate these numbers to determine
the LGD estimate for the exposure under consideration. However, one may want to
simplify the procedure by aggregating as many of these $LGD_{j,l}(sc_i)$ estimations as
possible in order to lower computational complexity. Table 9.4 demonstrates two
possible approaches to do so.

The idea of the first approach is to estimate LGDs for a subexposure without taking 
explicitly into account after-default scenarios. However, recoveries on secured 
exposures may not only depend on collateral but also on facility and borrower 
characteristics. In principle, this problem can be overcome by partitioning the pool
into homogeneous groups of obligors and estimating parameters for each partition
separately. A limited number of loss observations often hinders partitioning in
practice. The approach therefore appears especially appealing for large,
homogeneous portfolios.

35 In order to estimate PD, EAD, and LGD in a consistent way, one will often apply a cohort
approach for all three variables. Therefore the last valuation before default is the more appropriate
reference value.

The second approach disaggregates recoveries which depend on asset charac- 
teristics and those which do not. In fact, it can be considered as a generalisation of 
the first approach. Mainly collateral-independent recoveries of the after-default 
scenarios “cure” and “restructuring without rmi usage” are estimated for the whole 
exposure while recoveries in scenarios with rmi usage are estimated separately for 
secured and unsecured subexposures (as was the case in approach 1). The complexity
of the approach is therefore only slightly higher; however, one has to estimate more 
parameters. 

Instead of modelling different components for (un)secured exposures and/or 
different after-default scenarios, one may also try to describe total recoveries on 
an exposure by a single recovery component. This might be done, for example, 
by considering the sum of expected asset recoveries as an explanatory variable. 
If exposures are secured by only one asset, as will often be the case, one may 
also try to incorporate asset values directly as explanatory variables into a 
recovery model. Since recovery rates generally depend on the respective asset 
type, such models will probably require considering asset type as an additional 
explanatory variable. Furthermore, one may face the same problems as already 
discussed above. 

Explicit consideration of after-default scenarios following the approach outlined 
in Sect. 9.5 and discussed in more detail above may be applied in loss data analyses 
as well as LGD estimation for defaulted exposures (see Sect. 9.7). Furthermore, 
explicit consideration of scenarios can sometimes be useful when combining 
different internal and external data sources or when loss data is missing for some 
parts of the bank’s portfolio. Incorporating external data into the model may require 
different techniques depending on type and data source. 

For example, probability of cure depends on the bank’s default definition. 
A separate description of the cure scenario may therefore be of interest for LGD 
calibration if external data (for example, from a data pooling) is used for estimation 
purposes or if the bank itself has changed its default criteria over time.³⁶

As a second example, assume that the bank has a low number of observations for 
some portfolio segments. It may then try to derive estimates (for example, consid- 
ering after-default scenarios) for these segments by comparing key characteristics 
of this portfolio segment with those of other segments where loss observations are 
available. Thus, the institute may obtain an idea of the recoveries it can expect on 
the respective portfolios. However, subjective methods as previously outlined can 
generally only supplement the analysis of external data. 


36 This may sometimes be the case during the introduction of Basel II compliant processes.

As a third example, consider the estimation of recovery rates for assets where the
bank does not have its own workout experience.³⁷ A possible approach for deriving
recovery rates for collateral (in part) from external data can be stated as follows:


1. Estimate the time series of value depreciation for the specified asset type. 
Sources of information on value depreciation can be market data as well as 
data from brokers or appraisers. 

2. Estimate the time $\Delta t$ required for repossession or sale. In practice, one may
observe time series of cash flows, for example rents or leasing rates followed 
by one or several cash flows from the observed asset sale. While such cash flow 
patterns may theoretically also be recognized in a model, it will often be sufficient 
to assume that the total cash flow arises at one point in time. An exposure- 
weighted average time often provides a reasonable reference time. If no recovery 
observations are available, one may refer to experience from similar asset types 
or rely on expert judgement. 

3. Estimate haircuts D for value volatility, distress sale, etc. Again, market data can 
often be a main source of information. Experience from repossession or sale of 
similar assets may also provide useful information for estimating haircuts. In 
addition, one has to determine an appropriate discount factor; see Sect. 9.6.2.4. 


Having determined these parameters, recovery estimates can be generated as
$NPV(V(t_0 + \Delta t) \cdot (1 - D))$. To obtain a better idea of the magnitude of recoveries, one may
also perform scenario analyses or simulations where the input parameters determined
in the three steps above are varied in order to reflect certain economic scenarios.
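
The three steps and the subsequent scenario analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: geometric depreciation, a single cash flow at the time of sale, a flat discount rate, and all parameter values are hypothetical choices rather than part of the procedure described above.

```python
def collateral_recovery_estimate(value_today, annual_depreciation,
                                 time_to_sale, haircut, discount_rate):
    """NPV of the expected sale proceeds of an asset.

    Assumptions of this sketch: geometric value depreciation (step 1),
    one cash flow after time_to_sale years (step 2), and a single
    haircut and flat discount rate (step 3).
    """
    # Step 1: project the asset value to the assumed time of sale.
    future_value = value_today * (1.0 - annual_depreciation) ** time_to_sale
    # Step 3: apply the haircut for value volatility, distressed sale, etc.
    proceeds = future_value * (1.0 - haircut)
    # Discount the single cash flow back to the reference date.
    return proceeds / (1.0 + discount_rate) ** time_to_sale


# Hypothetical scenarios: a downturn is assumed to lengthen the time to
# sale and to increase depreciation and haircut.
scenarios = {
    "base": dict(annual_depreciation=0.10, time_to_sale=2.0, haircut=0.20),
    "downturn": dict(annual_depreciation=0.15, time_to_sale=3.0, haircut=0.35),
}

estimates = {
    name: collateral_recovery_estimate(100.0, discount_rate=0.05, **params)
    for name, params in scenarios.items()
}

for name, estimate in estimates.items():
    print(f"{name}: recovery estimate per 100 of current value = {estimate:.2f}")
```

Varying the scenario dictionary is a simple way to see how sensitive the recovery estimate is to each of the three inputs.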

Any substantial dependence between the value of an asset and the default 
event of the borrower should carefully be taken into account, since they may sub- 
stantially decrease the effect of risk mitigation (see also BCBS (2004), § 469). It is 
often helpful to distinguish between general and specific dependences. The first 
named recognizes “normal” dependences which should be reflected in the recovery 
rates discussed so far. The second type addresses an individual characteristic of 
a facility-collateral relation, which is generally difficult to detect automatically. It 
is therefore often reasonable to give credit analysts the possibility to grade 
such dependences manually. These grades can then be used to adjust haircuts on 
recovery rates in an appropriate manner. 


Exposures Secured by Guarantees or Credit Derivatives³⁸


Since the risk mitigation effect of a guarantee essentially consists of a (partial)
transfer of credit risk to a different entity, one may explicitly model the guarantor’s
default probability as a major driver of the guarantee’s value, i.e. recoveries from a
guarantee can be described as³⁹

$$RR_{j,l} = PD(G|B) \cdot RR^{DD} + (1 - PD(G|B)) \cdot RR^{SD} \qquad (9.9)$$

with $PD(G|B)$ the conditional probability of default of the guarantor given the
default of the borrower. The parameters $RR^{SD}$ and $RR^{DD}$ are the recovery rates a
bank may observe in case of an isolated default (SD) of the borrower or a double
default (DD) of both the borrower and guarantor. One may extend (9.9) analogously
for cases where an exposure is secured by more than one guarantee (for example, in
case of a counter-guarantee). The size of a secured exposure, $sq_{j,l}$, can be
determined in a similar way as described for collateral above, taking into account that
the reference value of a guarantee is generally defined as a maximum amount,
$V^{max}(Gar)$, and/or a certain percentage $sq_{Gar}$ of the exposure.⁴⁰

37 For unsecured exposures, recovery estimates may be derived from market LGDs; see Sect. 9.4.

38 The following considers guarantees to simplify the presentation. Credit derivatives can often be
treated in a similar way.
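
As a small numerical illustration of (9.9), the following Python sketch computes the recovery rate of a guaranteed exposure part and the size of that part. The function names, the way the maximum amount and the percentage are combined, and the example figures are assumptions made for this sketch only.

```python
def guarantee_recovery_rate(pd_g_given_b, rr_sd, rr_dd):
    """Recovery rate from a guarantee following the structure of (9.9).

    pd_g_given_b: probability that the guarantor defaults given that the
                  borrower has defaulted
    rr_sd:        recovery rate if only the borrower defaults
    rr_dd:        recovery rate if borrower and guarantor both default
    """
    if not 0.0 <= pd_g_given_b <= 1.0:
        raise ValueError("conditional default probability must lie in [0, 1]")
    return pd_g_given_b * rr_dd + (1.0 - pd_g_given_b) * rr_sd


def guaranteed_share(ead, max_amount=None, max_share=1.0):
    """Share of the exposure covered by the guarantee.

    Assumption of this sketch: if both a maximum amount and a percentage
    are agreed, the smaller of the two limits applies.
    """
    share = max_share
    if max_amount is not None:
        share = min(share, max_amount / ead)
    return min(1.0, share)


# Hypothetical example: strong guarantor, low recovery in a double default.
rr = guarantee_recovery_rate(pd_g_given_b=0.10, rr_sd=0.95, rr_dd=0.30)
sq = guaranteed_share(ead=200.0, max_amount=100.0, max_share=0.8)
print(f"recovery rate on guaranteed part: {rr:.3f}, guaranteed share: {sq:.2f}")
```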

When published first in June 2004, the Basel II Framework restricted risk 
mitigation effects of guarantees by requiring that the risk weight resulting from 
an exposure secured by a guarantee should not be less than that of a comparable 
exposure with the guarantor in place of the borrower. This approach is known as the 
substitution approach, indicating the basic idea of replacing the borrower by the 
guarantor. It has often been criticized for being too conservative. To understand 
why, consider for a moment the borrower as a first guarantor of the contractual cash 
flows. The guarantor then in fact provides a counter-guarantee for these cash flows. 
Therefore, the bank faces substantial losses only if the guarantor is unable to pay at 
the time of the borrower’s default, i.e. in case of a double default. Only if one 
assumes perfect dependence between the two defaults, which will generally not be 
the case, a substitution mechanism will describe the credit risk appropriately. 

With its update in 2005, Basel II now allows for a limited recognition of double 
default effects in both IRB approaches. Restrictions are defined on the set of eligible 
instruments, obligors, and guarantors as well as on the method and the correlation 
parameters.⁴¹ A Merton-style default model [see Merton (1974)] is considered to
determine joint default probabilities of guarantor and obligor. Let $Y_i$ be the
appropriately normalized asset value of a borrower or guarantor $i$ at a 1-year horizon,
respectively. With $X$ a systematic risk factor, $Z_{BG}$ a risk factor shared by borrower
and guarantor, and $E_i$ a counterparty-specific risk factor, the asset values of both
entities can be described as

$$Y_i = X \cdot \sqrt{\rho_i} + Z_{BG} \cdot \sqrt{1-\rho_i} \cdot \sqrt{\psi_{BG}} + E_i \cdot \sqrt{1-\rho_i} \cdot \sqrt{1-\psi_{BG}} \qquad (9.10)$$

$X$, $Z_{BG}$, and $E_i$ are considered as independent random variables following a
standard normal distribution. Furthermore, one assumes that counterparty $i$ defaults
if its asset value, $Y_i$, falls below a threshold $x_i$. Given the default probabilities of
both entities, the joint probability can therefore be computed as

$$JPD(B,G) = \Phi(\Phi^{-1}(PD(B)), \Phi^{-1}(PD(G)), \rho_{BG}) \qquad (9.11)$$

with $\Phi^{-1}(PD(i)) = x_i$, $\Phi(\cdot,\cdot,\rho)$ the bivariate standard normal distribution
function with correlation $\rho$, and
$\rho_{BG} = (\rho_B \cdot \rho_G)^{1/2} + \psi_{BG} \cdot ((1-\rho_B) \cdot (1-\rho_G))^{1/2}$
the correlation between borrower and guarantor. Stressed default probabilities are
determined by conditioning on the systematic risk factor $X$. For technical details see
BCBS (2005) and Heitfield and Barger (2003).

39 Again, $j$ indicates the facility and $l$ the exposure part secured by the guarantee.

40 In practice, the value of a guarantee may depend on further warranty clauses. To mention a few,
guarantees may cover only a subset of the borrower’s obligations, for example only interest rate
payments or redemption. They may also be restricted to protect certain risk classes only (for
example, no political risks). Furthermore, they may (partly) protect residual loss after recovery of
other collateral and the bankrupt’s assets only. This article does not consider the modifications
necessary to adequately value such guarantees. Note that some characteristics mentioned above
may also be incompatible with Basel II requirements for eligible guarantees and can therefore only
be considered in internal models.

41 See BCBS (2004), §§ 284 (i)-(iii) and 307 (i), (ii).
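
The joint default probability of borrower and guarantor in this two-factor setting can be evaluated numerically. The sketch below uses only the Python standard library and computes the bivariate normal probability by one-dimensional Simpson integration; the function names, the integration scheme, and the parameter values are illustrative assumptions and not the regulatory implementation.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def borrower_guarantor_correlation(rho_b, rho_g, psi_bg):
    """Asset correlation implied by the shared systematic factor (loadings
    rho_b, rho_g) and the additional pairwise factor (loading psi_bg)."""
    return sqrt(rho_b * rho_g) + psi_bg * sqrt((1.0 - rho_b) * (1.0 - rho_g))


def joint_default_probability(pd_b, pd_g, rho_bg, steps=2000):
    """P(Y_B < x_B, Y_G < x_G) for standard normal Y with correlation rho_bg.

    Uses P = integral over u <= x_B of phi(u) * Phi((x_G - rho*u)/sqrt(1-rho^2)),
    evaluated with Simpson's rule on [-8, x_B] (steps must be even).
    """
    if not -1.0 < rho_bg < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    x_b = _STD_NORMAL.inv_cdf(pd_b)
    x_g = _STD_NORMAL.inv_cdf(pd_g)
    lower = -8.0
    if x_b <= lower:
        return 0.0
    scale = sqrt(1.0 - rho_bg ** 2)

    def integrand(u):
        return _STD_NORMAL.pdf(u) * _STD_NORMAL.cdf((x_g - rho_bg * u) / scale)

    h = (x_b - lower) / steps
    total = integrand(lower) + integrand(x_b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3.0


# Hypothetical parameters.
rho_bg = borrower_guarantor_correlation(rho_b=0.2, rho_g=0.2, psi_bg=0.5)
jpd = joint_default_probability(pd_b=0.01, pd_g=0.02, rho_bg=rho_bg)
print(f"correlation: {rho_bg:.2f}, joint default probability: {jpd:.6f}")
# Conditional PD of the guarantor given the borrower's default.
print(f"PD(G|B): {jpd / 0.01:.4f}")
```

Dividing the joint probability by the borrower's default probability gives the conditional guarantor default probability that enters (9.9).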

Both the substitution and the double default approach of the Basel II Framework 
are defined in a way that is most easily implemented in a two-step procedure. 
Firstly, it is necessary to estimate the LGD of borrower and guarantor considering 
the risk mitigation effect of collateral (if available) only. Afterwards, risk mitiga- 
tion effects of guarantees are recognized in a second step by appropriately modify- 
ing the risk-weight of the secured exposure following the substitution rule or 
double-default formula. 

For internal purposes, banks may want to relax the restrictions of Basel II or 
apply their own approach for recognizing double default effects. This can be done, 
for example, by computing recovery rates based on (9.9)-(9.11) or, whenever 
components of LGD are used as input parameters of some simulation model, by 
directly simulating the risk mitigation effect of guarantees within the simulation.⁴²
The required information about the dependence structure (i.e. correlations) may 
often be available through the bank’s credit portfolio model. Depending on the level 
of conservatism underlying these correlation estimates, one may want to impose 
additional margins of conservatism in order to avoid overestimating the effect of 
risk mitigation by guarantees. As for collateral, a bank may allow credit analysts to 
grade any specific correlation between guarantor and borrower, which may then, for 
example, result in a modified value of $\psi_{BG}$ in (9.10).⁴³ Estimates of the recovery
rates $RR^{SD}$ and $RR^{DD}$ can be obtained with only slight changes to the procedures
described above for assets.

42 In fact, a bank may use both techniques simultaneously for different purposes. For example,
explicit simulation of guarantees may sometimes be too time-consuming so that LGD numbers
already including the risk mitigation effect have to be applied instead.

43 It may sometimes be possible to detect certain types of dependences automatically. For example,
knowledge on economic interdependence of different addresses, which might be available in the
institute’s IT-systems (for example, in form of borrower units), can be used to decide whether (or
to what extent) a guarantee is eligible for a facility of a certain borrower.


Further Aspects of Estimating Recovery Rates 


Concluding Sect. 9.6.2.2, the following outlines additional aspects of recovery rate 
estimation not yet considered. 


• Participation effects. Having compensated the lender under a guarantee for any
obligation due by the borrower, the guarantor generally acquires the right to ask
for repayment, either by the borrower (which is generally not possible due to its
default) or from the recoveries of the borrower’s collateral and bankrupt’s assets.
Furthermore, guarantees may sometimes only protect residual loss after recov- 
eries from other risk mitigation instruments, etc. Taking these aspects into 
account complicates recovery rate estimation since simply adding the recoveries 
of different instruments may lead to distortion. Furthermore, recovery times of 
single instruments can change significantly, depending on whether other risk 
mitigation instruments also protect the same exposure. A tree representation of 
the transaction and its risk mitigation instruments can be helpful in describing 
these effects and deriving the respective recovery rates. 

• Optimal allocation of risk mitigation instruments. Whenever risk mitigation
instruments are not clearly assigned to single facilities, the bank may want to
optimize the allocation.⁴⁴ This can be done following simple heuristics or by
solving a (non) linear optimization problem for minimizing risk-weighted assets; 
see Beckmann and Papazoglou (2004) and Gürtler and Heithecker (2005). 

• Multi-period estimation. An institute may want (at least for some applications)
to generate a multi-period projection of its credit risk numbers. Different tech- 
niques like simulation or scenario computation may be applied for this purpose. 
As a first step, one may also decide to rely on the conservative assumptions of 
Basel II (i.e. applying downturn LGDs as time-independent estimate of future 
LGDs). 

To derive future recoveries from collateral, one can proceed similarly as 
already discussed for estimating recovery rates of assets based on external 
data. The depreciation profile of the respective asset type provides the basis 
for estimating a time series of the asset’s value. Depending on whether recovery 
rates are defined as the net present or nominal value of recoveries, estimates can 
be performed directly by multiplying recovery rate with the predicted future 
asset value or firstly estimating the time of recovery cash flows. For guarantees, 
it is necessary to estimate rating migration and cumulative default probability of 
the guarantor up to the (assumed) default time of the borrower. Furthermore, the
value of a guarantee in terms of $V^{max}(Gar)$ and $sq_{Gar}$ may sometimes change
over time.

44 The potential for optimization stems from the joint effect of different risk mitigation instruments,
possible currency mismatches, changes in exposure class due to risk mitigation, etc.

• Maturity and currency mismatches. Maturity or currency mismatches between
facility and risk mitigation instruments have to be considered in ex-ante esti- 
mates. Maturity mismatches may be recognized by computing a time-weighted 
average of LGD estimates with and without recoveries of the respective instru- 
ment. Currency mismatches are generally recognized by haircuts, which can be 
derived from an analysis of the volatility of FX rates. This may also require 
taking individual conversion agreements into account. 
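
The time-weighted treatment of a maturity mismatch mentioned in the last item can be illustrated as follows. This is a minimal sketch: the linear weighting by the protected fraction of the estimation horizon and all names and numbers are assumptions made for illustration.

```python
def lgd_with_maturity_mismatch(lgd_with_instrument, lgd_without_instrument,
                               protection_years, horizon_years):
    """Time-weighted average of LGD estimates with and without recoveries
    from a risk mitigation instrument that expires before the horizon.

    Assumption of this sketch: the weight of the protected estimate is the
    fraction of the horizon covered by the instrument, capped to [0, 1].
    """
    if horizon_years <= 0:
        raise ValueError("horizon must be positive")
    weight = min(1.0, max(0.0, protection_years / horizon_years))
    return weight * lgd_with_instrument + (1.0 - weight) * lgd_without_instrument


# Hypothetical example: a guarantee covers the first year of a 4-year horizon.
lgd = lgd_with_maturity_mismatch(0.20, 0.60, protection_years=1.0, horizon_years=4.0)
print(f"time-weighted LGD: {lgd:.2f}")
```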


9.6.2.3 Estimating Costs 


Similarly to recoveries, costs can generally be assigned to entities (i.e.
borrower and guarantors), credit products, and risk mitigation instruments
(collateral and guarantees). It therefore often makes sense to break down the workout costs
of a facility $j$ arising in an after-default scenario $sc_i$ into two basic components: (1)
general costs $Costs^{G}_{j,l}(sc_i)$, which reflect all costs of the workout process not related to
risk mitigation instruments, and (2) specific costs $Costs^{k}_{j,l}(sc_i)$, which reflect all
costs (on a secured exposure part $sq_{j,l}$) related to the handling of risk mitigation
instruments; for example, costs arising from the repossession of an asset.
With respect to (9.7) one then has

$$Costs_{j,l}(sc_i) = Costs^{G}_{j,l}(sc_i) + \sum_{k \text{ secures } sq_{j,l}} Costs^{k}_{j,l}(sc_i) \qquad (9.12)$$

Alternatively, one may decide to offset any costs attributed to collateral or
guarantees directly from the respective recoveries on secured exposures leading
to $RR' = \max\{0; RR - Costs^{k}\}$. If a bank plans to use its LGD estimates for IAS/IFRS
purposes as well, the equation should be implemented in a way that allows
separating internal and external costs since only the latter are generally allowed
to enter the respective IAS calculations.
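
The cost decomposition and the alternative netting of costs against recoveries can be sketched as follows. The `Cost` class, the function names, and the figures are hypothetical; costs are expressed here as rates relative to the respective exposure part, which is an assumption of this illustration.

```python
from dataclasses import dataclass


@dataclass
class Cost:
    """Cost rate split into internal and external parts, so that external
    costs can be extracted separately (e.g. for accounting purposes)."""
    internal: float
    external: float

    @property
    def total(self):
        return self.internal + self.external


def workout_costs(general, specific):
    """General costs plus the specific costs of all risk mitigation
    instruments securing the exposure part, kept split by cost type."""
    internal = general.internal + sum(c.internal for c in specific)
    external = general.external + sum(c.external for c in specific)
    return Cost(internal, external)


def net_recovery_rate(recovery_rate, specific_cost_rate):
    """Recovery rate after offsetting instrument-specific costs, floored at 0."""
    return max(0.0, recovery_rate - specific_cost_rate)


# Hypothetical example: one asset securing the exposure part.
general = Cost(internal=0.010, external=0.005)
repossession = Cost(internal=0.004, external=0.016)
costs = workout_costs(general, [repossession])
print(f"total: {costs.total:.3f}, external only: {costs.external:.3f}")
print(f"net recovery rate: {net_recovery_rate(0.55, repossession.total):.2f}")
```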

In practice, measuring direct and especially indirect costs can be difficult. The
required steps will depend on the institute’s cost accounting system, which may not
necessarily suit the requirements of LGD estimation. Internal costs may at least in
part be known only on a level that is more aggregated than required (for example
for workout or restructuring units but not for defaulted entities), causing extra
complications for model development.⁴⁵

Analysis of the institute’s workout processes may often serve as a starting point
for modelling the cost component of LGD. This comprises firstly identifying key
activities or processes causing workout related costs, their respective cost units as
well as possible explanatory variables, for all after-default scenarios. A rough
classification according to expected cost amounts might be helpful in guiding
further development. Expert judgment can play an important role at this stage.

45 Information on external costs will generally be collected in the CLDB. This assures its
availability.

While external costs may be assigned directly to processes, further assumptions 
are usually required to determine internal costs. Estimates of time required for a 
certain activity, the number of persons involved, as well as work intensity (i.e. 
percentage of daily or weekly working hours spent on the task) together with the
institute’s cost rate per working hour can provide a basis to derive cost estimates for 
workout activities. Ideally, key activities are recorded by appropriate events within 
the loss file so that the institute is able to estimate the lengths of its workout 
activities from past experience. Cost accounting and expert judgment can deliver 
at least a first estimate of the other parameters, whereas a final model may require a 
more detailed analysis of a sample of loss cases. Once key costs have been 
modelled, residual costs can often be distributed proportionally. 
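
A first estimate of the internal costs of a workout activity along these lines might look as follows. The multiplicative form and all names and figures are assumptions made for this sketch.

```python
def internal_cost_estimate(duration_weeks, persons, work_intensity,
                           weekly_hours, hourly_rate):
    """Internal cost of one workout activity.

    duration_weeks: estimated length of the activity
    persons:        number of persons involved
    work_intensity: fraction of weekly working hours spent on the task
    weekly_hours:   regular weekly working hours per person
    hourly_rate:    the institute's cost rate per working hour
    """
    hours = duration_weeks * weekly_hours * work_intensity * persons
    return hours * hourly_rate


# Hypothetical activity: two analysts spend a quarter of their time
# on a restructuring negotiation lasting ten weeks.
cost = internal_cost_estimate(duration_weeks=10, persons=2, work_intensity=0.25,
                              weekly_hours=40, hourly_rate=80.0)
print(f"estimated internal cost: {cost:.0f}")
```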

If costs are modelled on a borrower, collateral, or guarantee level, which may 
often be appropriate, LGD computation requires breaking them down to the facility 
level. This can either be done within the estimation procedure itself, i.e. individu- 
ally for each entity and credit facility, or a priori during model development. 
Reasonable distribution keys are in both cases the total exposure of an address’ 
facilities as well as collateral and guarantee values or, alternatively, the number of 
the respective objects. More realistic estimates can generally be expected from an 
individual cost distribution during the estimation process. However, the computa- 
tional effort may be too high with respect to the expected improvement of the 
quality of LGD estimates. 
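
Breaking borrower-level costs down to facilities with a distribution key can be sketched as below. The facility identifiers and amounts are hypothetical, and falling back to an equal split when all keys are zero is an assumption of this illustration (it corresponds to using the number of objects as the key).

```python
def allocate_costs(total_costs, keys):
    """Distribute costs proportionally to a distribution key.

    keys: mapping from facility identifier to its key value, e.g. exposure,
          collateral value, or guarantee value.
    """
    if not keys:
        raise ValueError("at least one facility is required")
    key_sum = sum(keys.values())
    if key_sum > 0:
        return {fac: total_costs * key / key_sum for fac, key in keys.items()}
    # Fallback: equal split by number of objects.
    return {fac: total_costs / len(keys) for fac in keys}


# Hypothetical borrower with two facilities, keyed by exposure.
allocation = allocate_costs(1000.0, {"loan_A": 300.0, "loan_B": 100.0})
print(allocation)
```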


9.6.2.4 Determining Discount Rates 


Both recovery and cost estimates require net present value computations to take 
material discount effects into account. The choice of discount rate(s) will affect the 
resulting LGD numbers — especially when recovery periods are long. Different 
approaches have been applied and discussed in the literature. Basic characteristics 
for a categorization are: historical vs. present rates, single rates vs. interest rate curves 
as well as the procedure applied to determine the rates or curves, respectively. 

Simple approaches, for example, discounting with the contractual loan rate, the 
effective original loan rate,⁴⁶ or lender’s cost of capital, have been applied in many
articles. From a theoretical point of view, it appears most appropriate to discount each
cash flow using a discount rate that reflects the respective level of risk as well as the 
time required for realizing it. Determining an appropriate discount rate curve for 
each risk class, however, can be difficult. Maclachlan (2004) suggested a procedure 
based on the CAPM that may be useful in this context. 


46 Since IAS requires the application of the effective original loan rate, a bank may think about
applying this rate in its estimates if LGD numbers are used for IAS purposes as well.

Discount rates applied in ex-post and ex-ante estimates may differ. Ex-post LGD
numbers are generally computed using historical interest rates observed at the time 
of default. Discount rates chosen for ex-ante estimates will depend on the applied 
discounted cash flow method. If cash flows are adjusted by margins of conserva- 
tism, the risk-adjusted rate should reflect the lower risk profile, i.e. it can sometimes 
be (almost) the risk free rate. Discount rates applied in downturn LGD estimates 
should also reflect downturn conditions. For a point-in-time LGD estimate, the 
current interest rate curves can be relied on to use the most up-to-date information. 
Combining current interest rates with past loss experience, however, may lead
to distorted estimates if dependencies between interest rates and nominal recoveries
are not considered adequately.
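
The sensitivity of LGD to the discount rate can be illustrated with a minimal sketch. It assumes a single flat annual rate and a hypothetical loss case; the function names, cash flows, and rates below are illustrative and not taken from the text.

```python
def npv(cash_flows, annual_rate):
    """Net present value of (time_in_years, amount) pairs at a flat annual rate."""
    return sum(amount / (1.0 + annual_rate) ** t for t, amount in cash_flows)


def workout_lgd(ead, recoveries, costs, annual_rate):
    """LGD = 1 - (NPV of recoveries - NPV of costs) / EAD."""
    return 1.0 - (npv(recoveries, annual_rate) - npv(costs, annual_rate)) / ead


# Hypothetical loss case: EAD of 100, recoveries after 1 and 4 years, costs after 2 years.
recoveries = [(1.0, 30.0), (4.0, 50.0)]
costs = [(2.0, 5.0)]

for rate in (0.02, 0.06, 0.10):
    print(f"rate={rate:.0%}  LGD={workout_lgd(100.0, recoveries, costs, rate):.3f}")
```

In this example the late recovery dominates, so a higher discount rate leads to a higher LGD; the longer the recovery period, the stronger the effect.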


9.6.2.5 Determining the Level of Conservatism for LGD Estimates 


The Basel II Framework asks for conservative LGD estimates: 


• LGD has to be estimated so as to reflect economic downturn conditions “where
necessary to capture the relevant risk” (BCBS (2004), § 468).

• LGD cannot be less than the long-run default-weighted average (BCBS (2004),
§ 468).

• Banks must add a margin of conservatism to their LGD estimates that is related
to the likely range of unpredictable errors (BCBS (2004), § 451).

• Institutes must consider dependencies between the risk of the borrower and that
of the collateral as well as the collateral provider. Furthermore, currency mismatches
have to be considered conservatively (BCBS (2004), § 469).⁴⁷


The kind of (conditional) LGD expectation defined by Basel II will not always 
correspond to the concepts that banks may have defined for their internal risk 
measurement. Specifically, the required downturn characteristic can be questioned 
for internal application where one generally wants to recognize the economic cycle 
in an explicit manner (point-in-time estimate). Depending on the complexity that a 
bank is willing to accept in its methods, diverging requirements may lead to 
different models or parameterizations of LGD components applied for regulatory 
and internal purposes, respectively. One possibility is to apply the concept, pro- 
posed in BCBS (2004) for non-performing exposures, to performing positions as 
well, i.e. to refer to a best estimate LGD for internal credit risk management⁴⁸ while
applying a conservative LGD for regulatory purposes. This article will not discuss 
this rather institution-specific question in more detail. 

If LGD estimates are composed from estimates of their components as discussed
in this article (see Sect. 9.4), each of the models for these components has to fulfil
the requirements mentioned above. When determining the level of conservatism for
components, the impact on the resulting level of conservatism for the final LGD
estimate should be considered carefully to avoid overly conservative estimates.

47 Means of fulfilling this requirement were discussed in Sect. 9.6.2.2.
48 Volatility of LGD then has to be recognized separately in unexpected loss estimates.

Downturn conditions can be recognized following different approaches. A first 
approach is to identify the subset of loss observations reflecting economic downturn 
and to develop estimation procedures based only on this reference (sub) dataset. 
Time series of macroeconomic variables may be used to identify the respective time 
periods reflecting economic downturn. However, with limited loss observations 
available, this approach will often be a rather theoretical option. Alternatively, one 
may restrict considerations to the marginal distribution of loss observations for the 
considered component, i.e. implicitly recognize economic downturn by choosing an 
appropriately conservative quantile. If the bank intends to develop an LGD model, 
which explicitly recognizes the impact of economic cycles, a more elegant solution 
might be to estimate downturn LGDs by applying this model with downturn 
parameters instead of input parameters reflecting the current economic situation. 

Margins of conservatism can be derived as percentiles from empirical distributions,
based on appropriate parametric distribution assumptions or, for example,
from applying resampling techniques such as bootstrapping. In practice, observed
volatilities can be large, leading to large margins even for lower confidence levels.
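
As an illustration of the resampling variant, the following sketch derives a margin of conservatism as a percentile of the bootstrap distribution of the mean loss rate. The loss observations, the 90% level, and the function name are hypothetical assumptions made for this example only.

```python
import random
import statistics


def bootstrap_mean_quantile(losses, quantile, n_resamples=2000, seed=0):
    """Quantile of the bootstrap distribution of the mean loss rate."""
    rng = random.Random(seed)
    n = len(losses)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(losses, k=n)) for _ in range(n_resamples)
    )
    index = min(int(quantile * n_resamples), n_resamples - 1)
    return means[index]


# Hypothetical observed loss rates of one LGD component.
observed_lgd = [0.05, 0.10, 0.20, 0.25, 0.30, 0.45, 0.60, 0.80, 0.15, 0.35]

best_estimate = statistics.fmean(observed_lgd)
conservative = bootstrap_mean_quantile(observed_lgd, quantile=0.90)
margin = conservative - best_estimate
print(f"best estimate={best_estimate:.3f}  margin={margin:.3f}")
```

With only ten volatile observations the resulting margin is large relative to the mean, which mirrors the practical problem described above.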

Practical problems also arise where loss history may not reflect the character- 
istics of future losses. If, for example, a bank redesigns its workout processes or 
changes its workout strategy, future losses may differ from what has been observed 
in the past. Depending on the portfolio it can take several years until the effect of a 
structural break may become visible in loss observations. During that time the bank 
has the difficult task of recognizing the unknown effect of the modification in its 
loss estimates in a conservative manner. Similar problems of data aging may arise 
due to changes in laws, etc. 


9.7 LGD Estimation for Defaulted Exposures 


When estimating LGDs for defaulted facilities, an institute faces a slightly different 
situation than for performing exposures. Besides differences in regulatory require- 
ments (e.g., the need to generate a best estimate and conservative estimate of LGD; 
see Sect. 9.2.1) and possible synergies from collaboration with provisioning pro- 
cesses (see Sect. 9.2.2), the bank will often also be able to estimate LGDs for 
defaulted entities based on better information. Defaulted exposures are generally
monitored more intensively than performing facilities, resulting in more up-to-date
and often also more precise information about their current status. The bank will also
receive additional information, not available before default. This can be explicit 
information, for example decisions taken during the workout process or updates of 
market data, or implicit information, as for example time passed after default. 
Explicit information generally replaces the estimate for some components of 
LGD. One may therefore think of LGD estimates for defaulted exposures as a
transition from ex-ante LGD estimation to ex-post LGD observation. Update
procedures can differ depending on whether the bank keeps the time of default as
reference time or considers the current date instead. More interesting for practical
application is generally the second variant, which considers only residual loss, i.e.

$$\mathrm{EAD}(t) = \mathrm{EAD}(t_{DF}) - \sum_{\tau \in [t_{DF},\, t)} cf_\tau$$

$$\mathrm{LGD}(t) = 1 - \frac{\mathrm{NPV}\left(CF^{R}_{\tau},\ \tau \ge t\right) - \mathrm{NPV}\left(CF^{Costs}_{\tau},\ \tau \ge t\right)}{\mathrm{EAD}(t)} \qquad (9.13)$$

with cf and CF the realized or expected recovery cash flows and costs. While the
update-scheme itself has a simple structure, its implementation can become com- 
plicated. In particular, the update of EAD and LGD requires that the sources of all 
cash flows can be automatically identified. 
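
The update scheme can be sketched as follows. This is an illustration under simplifying assumptions: a flat annual discount rate, times measured in years, and realized cash flows already netted; all function names and numbers are hypothetical.

```python
def npv_at(cash_flows, valuation_time, annual_rate):
    """NPV at valuation_time of all (time, amount) flows with time >= valuation_time."""
    return sum(
        amount / (1.0 + annual_rate) ** (tau - valuation_time)
        for tau, amount in cash_flows
        if tau >= valuation_time
    )


def residual_ead(ead_at_default, realized_cash_flows, default_time, t):
    """EAD(t): exposure at default less cash flows realized in [default_time, t)."""
    realized = sum(a for tau, a in realized_cash_flows if default_time <= tau < t)
    return ead_at_default - realized


def residual_lgd(ead_t, expected_recoveries, expected_costs, t, annual_rate):
    """LGD(t): one minus the net NPV of expected future flows relative to EAD(t)."""
    net = npv_at(expected_recoveries, t, annual_rate) - npv_at(expected_costs, t, annual_rate)
    return 1.0 - net / ead_t


# Hypothetical case: default at time 0, update 1.5 years later.
realized = [(0.5, 20.0), (1.0, 10.0)]
ead_t = residual_ead(100.0, realized, default_time=0.0, t=1.5)
lgd_t = residual_lgd(ead_t, [(2.5, 40.0)], [(2.0, 3.0)], t=1.5, annual_rate=0.05)
print(f"EAD(t)={ead_t:.1f}  LGD(t)={lgd_t:.3f}")
```

The sketch presupposes that every cash flow carries its date and type, which is exactly the identification requirement mentioned above.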

Implicit information, for example, time after default or certain events observed 
after default, may be used in estimates of NPL-LGDs by considering (abstract) 
states of information as additional explanatory variables or, more generally, state 
space models. As an example, consider cure probability as a decreasing function of 
time after default in a model following (9.3)-(9.7). While theoretically appealing, 
estimating such models requires large reference datasets and relatively homogeneous
portfolios if not (partly) parameterised by expert judgement. Portfolios of
standardized retail exposures may therefore be the main field of application. 

Purely statistical approaches will often not be able to capture all information 
available for individual defaulted facilities. For example, recoveries from the
bankrupt’s assets or collateral as well as costs or payment dates can often be estimated
more precisely based on the specific information available for a defaulted entity. 
An LGD estimate may therefore be improved by allowing overrides for some of 
the model’s input parameters or the purely statistical LGD estimate. This can also 
affect the model design. 

Since provisioning requires similar information to loss estimation, it may be 
reasonable to link the two processes in order to use a consistent set of information, 
avoid process redundancies, and let provisions and LGD estimates converge as far 
as possible, which may also simplify internal communication of these numbers. 
Links may be established in both directions, as depicted in Fig. 9.4: A statistical 
LGD model may deliver information concerning the loss distribution of a defaulted 
entity or credit product as well as other useful information, for example, expected 
collateral recoveries etc. These may serve as a basis or reference for determining 
provisions in a provisioning tool. During the provisioning process, the responsible 
analyst may then modify or supplement estimation parameters based on her information
or expectations about the respective loss case.⁴⁹ These inputs can afterwards
be used to improve LGD estimates. 


49 For example, she may elect the respective after-default scenario or modify the time structure of
future recovery cash flows.

Fig. 9.4 Connection of LGD estimation for nonperforming exposures and provisioning⁵⁰


Before any input from the provisioning process may enter NPL-LGD estimates, 
it is necessary to analyse any differences in the respective valuation approaches 
applied by the bank. It should be kept in mind that coupling the two processes can 
further complicate the implementation of the estimation process. The decision of 
whether and how the two processes should cooperate often depends on the respec- 
tive portfolio. For example, loss experience for standardized or retail portfolios will 
generally provide a sufficient basis to develop and apply more advanced machine- 
driven estimation procedures. In this case, the bank may want to derive its provi- 
sions from LGD estimates (but not vice versa). The opposite will probably become 
true for customized credit products where expert judgment may prove more valu- 
able than limited empirical loss experience. 

Due to regulatory requirements, institutes have to determine a best estimate of
loss (LGD^BE) as well as a conservative estimate (LGD) for Basel II purposes. As
discussed in Sect. 9.6.2.5, some banks may apply a similar scheme for performing 
facilities as well. All procedures described so far in this article can be considered to 
generate best estimate LGDs (in the sense of the best possible estimate of expected 
loss quotas). Conservative estimates may be generated by appropriately stressing the 
best estimate. This may be done by stressing input parameters of the estimation
procedure (for example, recovery rates for collateral, workout periods, etc.) or the 
resulting estimate. Empirical distributions of historically observed parameter values 
(e.g. recovery rates of certain collateral types, etc.) or loss quotas may help to define 
stress factors. Sometimes, the same or similar stress factors as already used for 
performing loans may be applied for non-performing loans as well. Sometimes, one 
may expect stress factors to be smaller after default due to more precise information 
about the economic situation of a defaulted entity. However, appropriately stressing
human judgment (which may enter when applying procedures as outlined above) in
LGD estimates can be difficult. Depending on its impact on the estimate, one may
simply ignore it in conservative LGD estimates.⁵¹

50 (N)PL-LGD is used as an abbreviation for an LGD of a (non)performing exposure.


9.8 Concluding Remarks 


This article provides a general survey of LGD modelling from a practical point of 
view. Due to the scope of the article, various aspects including most technical 
details could not be covered. Several aspects of LGD estimation are still topics of 
discussion and current research. Two important examples are 


• Lack of loss history. Estimating LGD for exposures of portfolios with few or no
defaults is a difficult but common problem. But even for portfolios where loss
data is in principle available, it may not always be representative of the future
due to internal or external changes, for example, modifications in workout 
strategy or relevant laws. Some simple approaches to deal with this situation 
have been outlined in this article; however, additional research is recommended. 

• Validation. While not considered in detail within this article, model validation
forms an important part of LGD methodology. BCBS (2004) requires all banks 
applying the advanced IRB approach to validate their rating systems and pro- 
cesses on an annual basis. Little has been published on the validation of LGD 
models; see for example Bennett et al. (2005). Some methods may be taken from 
PD validation, which already provides more advanced concepts⁵²; however,
specific characteristics of LGD estimation approaches will probably require 
adjustments or the development of new validation approaches. The lack of 
loss data will again complicate the application of quantitative tools for some 
portfolio segments. A unification of validation techniques, processes, and reports 
for the risk parameters PD, EAD, and LGD appears reasonable to reduce costs 
and promote an understanding of the results within the institute; however, little 
can be found in the literature on this topic. 


Many further, less prominent topics arise from daily work within the conflicting 
fields of statistical significance, degree of detail desired for different applications, 
and cost-benefit aspects. One may therefore expect and look forward to seeing further
interesting developments within the field of LGD estimation.


51 One may also think about allowing analysts to judge the uncertainty of recoveries as well, giving
them the possibility to influence stress factors, etc. Any degree of freedom in the applied 
procedure, however, may not only improve the quality of estimates but also bears the danger of 
deterioration and generally also complicates the whole procedure — from implementation and 
workflow aspects up to a later validation. 


52 Cf. Chaps. 14 and 15.

Chapter 10
Possibilities of Estimating Exposures 


Ronny Hahn and Stefan Reitz 


10.1 EAD Estimation in Line with the Loss-Parameter- 
Estimation of Basel II 


10.1.1 Definition of Terms 


The exposure at default (EAD) is defined as the expected amount of a receivable
at the time a default happens. In order to describe the borrower-related risk, the
EAD has to be set economically before provisions are considered.¹ Provisions
that are utilizable bank-internally only serve to cover the equity in the balance
sheet in case of losses, as possible losses have already reduced the risk-bearing
capacity of the bank at the moment of risk identification through the realization of
provisions.

This definition shows that in a first step the EAD is determined by the exact time
of default. From an economic point of view, the EAD to be expected depends on the
horizon of default, i.e., it makes a difference whether this horizon consists of 1 or of
2 years. According to regulatory prescriptions the EAD must not be lower than the
book value of a balance sheet receivable.² Therefore there is no regulatory necessity
to estimate future EADs for such positions.

1 Cf. BCBS (2006), §308.
2 Cf. BMF (2006), §100 seqq.

Fig. 10.1 Difference between exposure and credit approval


Credit conversion factors (CCF) have to be estimated for off-balance-sheet
transactions and credit approvals. They describe the percentage rate of credit lines
(CL) that have not been paid out yet, but that will be utilized by the borrower until the
default happens. Therefore the EAD is defined as:


EAD = CL · CCF.


For credit lines that already have been paid out (balance sheet receivables) the 
CCF is defined as 100%. The estimation procedure for credit conversion factors is 
further illustrated in Fig. 10.1. 
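
A minimal sketch of this definition, assuming that a facility is split into its drawn part (CCF of 100%) and its undrawn part (estimated CCF). The function name and the numbers are illustrative only.

```python
def exposure_at_default(credit_line, drawn, ccf_undrawn):
    """EAD as the drawn amount (CCF = 100%) plus the CCF-weighted undrawn part of the line."""
    undrawn = max(credit_line - drawn, 0.0)
    return drawn * 1.0 + undrawn * ccf_undrawn


# Hypothetical facility: line of 100, of which 60 is drawn, estimated CCF of 50%.
print(exposure_at_default(credit_line=100.0, drawn=60.0, ccf_undrawn=0.5))
```

For a fully drawn balance sheet receivable the undrawn part is zero, so the EAD equals the drawn amount regardless of the estimated CCF.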


10.1.2 Regulatory Prescriptions Concerning the EAD 
Estimation 


Regulatory prescriptions concerning estimations of loss parameters and therefore 
also the EAD are mainly given in the regulatory rules related to Basel II. Within the 
capital provisioning requirements that are defined here, three separate approaches 
for the fixing of risk assets are distinguished. In the Standardized Approach (SA) 
and the Foundation Internal Ratings-Based Approach (FIRB) there is no freedom as 
far as the estimation of the EAD/CCF is concerned. This is due to the fact that the 
CCF related to classes of receivables is prescribed by regulatory entities. Specific 
minimum requirements on eligible EAD estimates are defined in the Advanced 
Internal Ratings-Based Approach (AIRB)³:


• “EAD for an on-balance sheet or off-balance sheet item is defined as the
expected gross exposure of the facility upon default of the obligor. For on-balance
sheet items, banks must estimate EAD at no less than the current drawn amount,
subject to recognizing the effects of on-balance sheet netting as specified in the
foundation approach. ...”

3 Cf. BCBS (2006), §474 seqq.

• “Advanced approach banks must have established procedures in place for the
estimation of EAD for off-balance sheet items. These must specify the estimates 
of EAD to be used for each facility type. Banks’ estimates of EAD should reflect 
the possibility of additional drawings by the borrower up to and after the time a 
default event is triggered. Where estimates of EAD differ by facility type, the 
delineation of these facilities must be clear and unambiguous.” 

• “Advanced approach banks must assign an estimate of EAD for each facility.
It must be an estimate of the long-run default-weighted average EAD for 
similar facilities and borrowers over a sufficiently long period of time, but 
with a margin of conservatism appropriate to the likely range of errors in the 
estimate. If a positive correlation can reasonably be expected between the 
default frequency and the magnitude of EAD, the EAD estimate must incor- 
porate a larger margin of conservatism. Moreover, for exposures for which 
EAD estimates are volatile over the economic cycle, the bank must use EAD 
estimates that are appropriate for an economic downturn, if these are more 
conservative than the long run average. For banks that have been able to 
develop their own EAD models, this could be achieved by considering the 
cyclical nature, if any, of the drivers of such models. Other banks may have 
sufficient internal data to examine the impact of previous recession(s). How- 
ever, some banks may only have the option of making conservative use of 
external data.” 

• “The criteria by which estimates of EAD are derived must be plausible and
intuitive, and represent what the bank believes to be the material drivers of EAD. 
The choices must be supported by credible internal analysis by the bank. The 
bank must be able to provide a breakdown of its EAD experience by the factors it 
sees as the drivers of EAD. A bank must use all relevant and material informa- 
tion in its derivation of EAD estimates. Across facility types, a bank must review 
its estimates of EAD when material new information comes to light and at least 
on an annual basis.” 

• “Due consideration must be paid by the bank to its specific policies and strate-
gies adopted in respect of account monitoring and payment processing. The bank 
must also consider its ability and willingness to prevent further drawings in 
circumstances short of payment default, such as covenant violations or other 
technical default events. Banks must also have adequate systems and procedures 
in place to monitor facility amounts, current outstandings against committed 
lines and changes in outstandings per borrower and per grade. The bank must be 
able to monitor outstanding balances on a daily basis.” 


Apart from that, specific minimum estimation periods are defined for the
classes of receivables. The regulatory prescriptions clearly show the high qualitative
and quantitative demands banks have to meet when using the AIRB. In practice, the
use of a bank’s internal model to bring the estimation method for the EAD
into unison with the other loss parameters, probability of default (PD) and loss
given default (LGD), is independent of the question of whether the model is used
for internal or for regulatory requirements or for both.


10.1.3 Delimitation to Other Loss-Parameters 


Next to a clear definition of the specific parameters and the best possible 
data quality a uniform definition of default is most important for a methodo- 
logically correct internal estimation of loss parameters for credit risk. The first 
primary loss parameter is the PD which describes the probability of default of 
a borrower within a predefined period — usually 1 year. The statistical methods for 
estimating the PD are described elsewhere in this book in Chaps. 1-3 and 5. 

The estimation period of the PD is identical with the estimation period of the 
EAD — 1 year. The difference between those two parameters lies in the data basis 
that is needed. Concerning the PD estimation the general question is how many of 
the original customers (at time t₀) will default. Therefore the overall portfolio has
to be considered. Concerning the CCF estimation the data basis is reduced and 
only those credit lines where a default took place within the period of observation 
(1 year) are included ex post. Within an economic focus and taking amortization 
effects into consideration all receivables have to be considered for the EAD 
estimation. 

LGD, which describes the fraction of the defaulted amount of receivables (EAD)
that finally leads to a loss for the creditor, is the third major component of credit risk
and expected loss EL (EL = EAD · PD · LGD). The LGD estimation, similar to the
CCF estimation, depends on defaults that have already taken place. Only on
the basis of the defaulted receivables can it be measured empirically which part
of the default volume will lead to an economic loss for the bank. The biggest
difficulty concerning the empirical LGD estimation is the relatively limited amount
of data and the long duration of the estimation period. While PD and EAD/CCF
estimation with an estimation horizon of 1 year already requires a very long period
compared to the estimation of market price risk parameters, LGD estimation
periods reach 3-5 years on average. Experience within banks shows that the
liquidation of defaulted engagements, including the realization of collateral,
takes such long periods. For this reason, a backtest of LGD estimations becomes
increasingly difficult or nearly impossible.

The following Fig. 10.2 shows the relations between EAD-CCF-LGD in a 
scheme. The figure shows that LGD always refers to the EAD. Concerning estima- 
tions a simple motto might be derived from this fact. Everything that takes place 
until the default happens has to be considered in the EAD estimation. All payments 
after this event only influence the estimation of LGD. 

Fig. 10.2 Relation between EAD-CCF-LGD


10.1.4 Regulatory EAD Estimation for Derivative Products 


If we look at derivative products like interest rate swaps, caps, floors, swaptions, 
cross currency swaps, equity swaps, or commodity swaps, two kinds of counter- 
party risks have to be considered: settlement and pre-settlement risks. 

Settlement risks occur if the payments are not synchronous: This is for example 
the case if Bank A has paid a EUR cash flow in a cross currency swap to Bank B 
before it has received the USD cash flow. So the risk consists in the missing USD 
cash flow. If Bank B defaults a loss in the amount of this cash flow would occur. 
Settlement risks obviously mostly have a short-term character. 

Much more important are the pre-settlement risks. The following situation is
characteristic of pre-settlement risks: Bank A expects a rising interest
rate curve in the future. To hedge its loan portfolio, Bank A enters into a payer swap
with Bank B. This transaction eliminates the interest rate risk but creates a
counterparty risk. If Bank B defaults during the lifetime of the interest rate swap,
Bank A has to look for a new counterparty with which to enter into the same payer swap. If in
the meantime the interest rate curve has moved up, the replacement with an
identical swap will only be possible by making an upfront payment to the new
counterparty. Pre-settlement risks are of a long-term nature because they may occur
during the whole lifetime of the derivative product.

For calculating the EAD for derivative products it has to be noted that the EAD 
consists of two parts: 


• The current exposure (CE): This is the replacement cost of a derivative transaction
if the counterparty defaults immediately, and is given by the actual market
value of the instrument if this market value is positive. If the market value is
negative it is zero: CE = max{market value; 0}.

• The potential future exposure (PFE): This is an estimate for the increase in
market value to a pre-specified time horizon (e.g., 1 year). It should be calculated
using probability analysis based upon a specific confidence interval. So the EAD
is given by: max{market value; 0} + PFE.


Fig. 10.3 EAD calculation for derivative products in the regulatory context


For calculating the PFE for regulatory purposes fixed add-on factors, which have 
to be applied to the nominal amount of the contract, are used. Figure 10.3 demon- 
strates the calculation of the EAD for derivative products in the regulatory context. 

If there is a netting agreement with the counterparty, the negative and positive 
market values of all derivative contracts, which are included in the netting agree- 
ment, can be offset against each other and the current exposure of all these contracts 
is therefore given by: 


$$\mathrm{CE} = \max\Big\{\sum_i \text{market value}_i;\ 0\Big\} \qquad (10.1)$$


For the potential future exposure a total offsetting of the PFEs of the
various contracts is not allowed. A so-called “PFE floor”, which is given by 40% of
the sum of the PFEs of the various derivative contracts, must be provided. The
remaining 60% depends on the “market value structure” of the bilateral derivative
portfolio. Overall, the PFE under a netting agreement is given by:


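$$\mathrm{PFE} = 0.4 \cdot \sum_i \mathrm{PFE}_i + 0.6 \cdot \sum_i \mathrm{PFE}_i \cdot \frac{\max\big\{\sum_i \text{market value}_i;\ 0\big\}}{\sum_i \max\{\text{market value}_i;\ 0\}} \qquad (10.2)$$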


If there are further collateral agreements then the EAD can be reduced by the 
amount of the collateral. 
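
The netting rules can be illustrated with a short sketch of the net current exposure and the 40%/60% rule for the PFE. The function names and the portfolio are hypothetical, collateral is ignored, and the treatment of a portfolio without any positive market value is an assumption of this sketch, not a statement from the text.

```python
def current_exposure(market_values):
    """Net current exposure under a netting agreement: max(sum of market values, 0)."""
    return max(sum(market_values), 0.0)


def netted_pfe(market_values, pfes):
    """PFE under netting: 40% floor plus 60% scaled by the net-to-gross ratio."""
    gross_pfe = sum(pfes)
    gross_ce = sum(max(mv, 0.0) for mv in market_values)
    if gross_ce > 0.0:
        net_to_gross = current_exposure(market_values) / gross_ce
    else:
        # Ratio undefined without positive market values; 1.0 is a
        # conservative assumption made for this sketch only.
        net_to_gross = 1.0
    return 0.4 * gross_pfe + 0.6 * net_to_gross * gross_pfe


# Hypothetical netting set of three contracts.
market_values = [10.0, -4.0, 2.0]
pfes = [1.5, 0.5, 1.0]

ead = current_exposure(market_values) + netted_pfe(market_values, pfes)
print(f"CE={current_exposure(market_values):.1f}  EAD={ead:.1f}")
```

In the example the net current exposure is 8 instead of the gross 12, and the PFE falls from 3.0 to 2.4, since only 60% of the gross PFE benefits from netting.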
Overall, the way the EAD for derivative portfolios is calculated has the following
shortcomings:


• The add-on factors are static. The actual volatilities and correlations of the
economically relevant risk factors are not taken into account.

• The specific product structure is neglected in the add-on factors, e.g., for an
interest rate swap and an interest rate cap the add-on factor is the same as long as
they fall in the same maturity band.

• No offsetting between negative market values and the PFE is allowed. So
an interest rate swap with market value 0 and an interest rate swap with a
negative market value will lead to the same EAD.

• There is only a very rough differentiation between the add-on factors for products
with different maturities. For instance, an interest rate swap with a
maturity of 6 years has the same add-on factor as an interest rate swap with a
maturity of 30 years, although the two products react completely differently to
changes in the interest rate curve.

• No amortization effects are recognized. In the described “regulatory proceeding”
the cash flows of a product which are paid before the proposed default point
(e.g., 1 year) should not be considered in the CE.


Banks that use the AIRB may use more elaborate techniques for calculating the
EAD of derivative portfolios. These techniques will be explained in Sects. 10.2.3 and 10.2.4.


10.2 Banks’ Own Methods of EAD Estimation 


10.2.1 Introduction 


The method for EAD estimation depends on the product category. In the case of 
credit lines banks will use empirical methods (see Sect. 10.2.2) and for derivative 
products their own internal approaches (see Sects. 10.2.3 and 10.2.4). 

Under the Internal Model Method (IMM) of Basel II banks are allowed to derive 
estimations of EAD for derivatives using their own internal approaches. The key 
elements of such approaches are statistical methods for simulating future distribu- 
tions of credit exposures resulting from (netted) portfolios of financial instruments. 

The benefit from using the IMM instead of other methods such as the Current 
Exposure Method or the Standardised Method is the fact that the IMM is more risk- 
sensitive since it allows banks to apply very sophisticated and portfolio specific 
techniques which improve the measurement of counterparty credit exposure. 


10.2.2 Empirical Models for Credit Products 


The delimitation between the loss parameters fulfils the main requirement
concerning the creation of an internal empirical data collection model for the EAD.
It is advisable to include balance sheet exposures in addition to open lines in an
internal empirical model even if the respective information cannot be integrated into
the regulatory exposure estimation.

Fig. 10.4 Realized cash flows in the example


In general, variations in the EAD should not be underestimated even in the case
of balance sheet exposures. This will be shown in a short example. Consider two credits
A and B that are each utilized at a level of 100 € and bear an interest rate of 6%.
For Credit A a monthly repayment of 1% and for Credit B an annual
repayment of 12% are agreed upon. We assume that both borrowers stop making
their payments after 11 months, meaning that 90 days later a
default in accordance with the default definitions of Basel II occurs (Fig. 10.4).

This means for Credit B that at the time of default a total principal amount
of 100 € is outstanding. In addition, the 6% interest for 1 year (6 €) and the interest
for the 90-day period of about 1.6 € have not been paid. This sums up to a total
receivables amount (EAD) of 107.6 €. For Credit A this means that 11 repayments and
interest payments have been duly effected and that therefore the remaining amount
of receivables stands at 89 €. Interest for the 12th month and for the
excess period of about 1.75 € has to be added. The total amount of receivables that is to
be demanded from the customer sums up to about 90.75 €. This delimitation is also of
primary importance for the LGD estimation. If we assume that the debtor
of Credit A pays back the total receivable amount of 90.75 € and additionally all
costs related to administration, no loss occurs for the bank, either economically or
on the balance sheet. But if we assume the regulatory EAD definition for
balance sheet receivables, the EAD sums up to 100 € and a loss (LGD) of about
10% occurs.
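
The figures of the example can be approximately reproduced with the following sketch. It assumes simple interest on a 30/360 basis and, for the credit without any repayment, that the 90-day interest accrues on principal plus the unpaid annual interest; these conventions are assumptions of the sketch and are not stated in the text.

```python
ANNUAL_RATE = 0.06
PRINCIPAL = 100.0

# Credit with a single annual repayment: nothing has been paid before default.
interest_first_year = PRINCIPAL * ANNUAL_RATE
interest_90_days = (PRINCIPAL + interest_first_year) * ANNUAL_RATE * 90 / 360
ead_annual_repayment = PRINCIPAL + interest_first_year + interest_90_days

# Credit with monthly repayments of 1: eleven instalments and interest payments made.
outstanding = PRINCIPAL - 11 * 1.0
unpaid_interest = outstanding * ANNUAL_RATE * (30 + 90) / 360  # 12th month plus 90 days
ead_monthly_repayment = outstanding + unpaid_interest

# Loss quota if the regulatory EAD of 100 is used although only the
# economic receivable of the monthly-repayment credit is owed and repaid.
implied_regulatory_lgd = (PRINCIPAL - ead_monthly_repayment) / PRINCIPAL

print(f"{ead_annual_repayment:.2f} {ead_monthly_repayment:.2f} {implied_regulatory_lgd:.1%}")
```

Under these assumptions the economic EADs come out at roughly 107.6 € and 90.8 €, and the implied regulatory loss quota at roughly 9%, in line with the rounded figures of the example.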

The regulatory approach, under which the parameter effects (EAD vs. LGD) of cases
A and B cancel each other out, can be accepted as far as capital provisioning is
concerned but should not be accepted from an economic perspective.
This is due to the fact that, seen from a risk perspective, the parameter PD or default
correlations have to be added. Figure 10.1 therefore has to be adapted concerning
balance sheet receivables as follows (Fig. 10.5).

Fig. 10.5 Development of the exposure related to receivables on the balance sheet

When constructing empirical models it makes sense to define the EAD in market
values. In Case B the customer pays interest of 6%. If we assume that 1 year
before default an interest rate of 5% is in line with the market for this customer, a
cash value of more than 100% results. From an economic perspective, this claim
to profits is not realizable for the bank. A potential refinancing loss starting from the
date of default has to be considered within the LGD.

Once this methodological framework for an internal empirical model has been fixed,
building the model is relatively simple. In general, the following requirements have
to be met:


• Storing all EAD and CL related information for at least 1 year and for all
accounts, if necessary including market interest rates, conditions, cash flow
structures, etc.

• Segmentation of classes of receivables to create pools for the EAD/CCF estimation,
e.g., loans for home construction, current account overdrafts, or guarantees.
Practical experience shows that it is important to differentiate between products
where the credit commitment will be cancelled in the default event (e.g., investment
finance) and products which allow further drawings after default (e.g.,
guarantees).

• Classification according to classes of receivables, ratings, etc.

• Segmentation by the drawn level 1 year before default (e.g., 10% of line or 90% of
line).


The last two points refer to the necessity of a clear definition of the aggregation
level of the survey. The corresponding schemes are depicted in Fig. 10.6.

Fig. 10.6 Levels of aggregation of the EAD survey


10.2.3 Exposure Measurement for Derivative Products 


If a counterparty in a derivative transaction defaults, the position
will be closed out and there will be no future contractual payments. Depending on
the mark-to-market (MtM) value of the transaction at the time of default, two cases
are possible:


• If MtM is positive, a loss is realised. The size of this loss is the MtM value at
default time minus any recovery value.
• If MtM is negative, no loss is made.


The EAD as seen from today is thus a random variable defined by 
EAD = max(MtM, 0). 


As only positive MtM values are relevant, it is natural to define the expected
exposure (EE), which is the expected value of max(MtM, 0) of a single transaction
or a portfolio of transactions (including netting effects and collateral).

Within credit risk management an important question is what the worst exposure
could be at a certain time in the future. This question is answered by an exposure
measure called potential future exposure (PFE), which was already mentioned in
Part 1. In terms of statistics, the PFE is an exposure at a certain time in the future
that will be exceeded with a probability of no more than α%. The PFE is therefore
a quantile (the 1 - α% quantile) of the distribution of future MtM values (Fig. 10.7).

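In the simplest case, when MtM is a normally N(μ, σ²) distributed random
variable, we have

$$
EE = \int_{-\infty}^{\infty} \max(\mu + \sigma x,\, 0)\,\varphi(x)\,dx
   = \int_{-\mu/\sigma}^{\infty} (\mu + \sigma x)\,\varphi(x)\,dx
   = \mu\,\Phi(\mu/\sigma) + \sigma\,\varphi(\mu/\sigma),
$$

where φ and Φ denote the density and the distribution function of the standard normal
distribution.

Fig. 10.7 EE and PFE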
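As a numerical cross-check of the closed-form expression for the normal case, the following minimal Python sketch evaluates EE and the PFE quantile. The function names and the use of `statistics.NormalDist` are illustrative assumptions and not part of the original text; α is passed as a fraction (0.05 for 5%), and the PFE is taken as a quantile of the MtM distribution as described above, without flooring it at zero.

```python
from statistics import NormalDist


def expected_exposure_normal(mu: float, sigma: float) -> float:
    """EE = mu * Phi(mu/sigma) + sigma * phi(mu/sigma) for MtM ~ N(mu, sigma^2)."""
    std = NormalDist()  # standard normal distribution
    d = mu / sigma
    return mu * std.cdf(d) + sigma * std.pdf(d)


def pfe_normal(mu: float, sigma: float, alpha: float) -> float:
    """MtM level that is exceeded with probability alpha (alpha as a fraction)."""
    return NormalDist(mu, sigma).inv_cdf(1.0 - alpha)


# Hypothetical example: MtM with mean 0 and standard deviation 1
ee_example = expected_exposure_normal(0.0, 1.0)
pfe_example = pfe_normal(0.0, 1.0, 0.05)
```

For μ = 0 the EE reduces to σ·φ(0), i.e. roughly 0.4 standard deviations, while the 95% PFE lies at about 1.64 standard deviations.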


In reality, MtM distributions of complex derivative portfolios have to be
simulated by a Monte Carlo simulation. This means that all market variables
(including their correlations) that influence future portfolio values have to be
simulated, including all portfolio characteristics such as path dependencies, netting
agreements and collateralisation. In principle the following steps have to be carried
out (cf. Gregory 2010 and Cesari 2009):


• Choice of Risk Factors: The set of risk factors typically includes (depending on
the type of transactions in the portfolio) all underlyings and their volatilities: FX
rates, interest rates, credit spreads, equity and commodity prices, or implied
volatilities. These factors may be modelled in a simple one-factor model or a
more complex multi-factor approach, e.g., a multi-factor interest rate model. Of
course there will always be a trade-off between model sophistication and
tractability of the simulation. Whatever model is chosen, the key issue will be
that future multivariate distributions of market parameters are predicted in a
reasonable and efficient way and that the model is well calibrated to current
market data.

• Generation of Scenarios: In order to generate scenarios of the risk factors, a
time grid has to be defined which includes all future points in time t_i for which
risk factor realisations are needed. The number and spacing of simulation
points depend on the structure of the derivatives within the portfolio. In practice,
exposure profiles can be highly discontinuous over time due to maturity dates,
option exercise, cash flow payments and amortisation. The risk of missing
jumps in exposure is called the roll-off risk. The final simulation date t_n has to
be greater than the maturity of the instrument with the longest maturity within
the portfolio. Typical values for the number n of simulation points are within
the range 50-200. Figure 10.8 illustrates a simulated set of MtM
scenarios as well as calculated EE and PFE values (shown in the bottom chart).

Fig. 10.8 MtM paths, EE and PFE paths


• Portfolio Valuation: All positions within the portfolio under consideration have
to be revalued in every scenario and at each point in time t_i. It is important to
avoid extremely complex valuation models here as the number of instrument
revaluations is enormous. If the number of counterparties is denoted by x and the
(average) number of trades per counterparty by y, we have to perform (for n = 100 and
10,000 scenarios per time step) x · y · 100 · 10,000 revaluations, i.e. several
billion revaluations for large portfolios. The need for (crude) analytical
approximations for pricing formulas is obvious.

• Aggregation: As a result of the scenario generation and portfolio revaluation we
will have a (huge) matrix of MtM values for each single transaction of our
portfolio. For each point in time t_i and scenario k, the exposure E_{ik} of all
transactions belonging to a specific netting set (a set of transactions with a
counterparty under certain netting conditions) is defined as

$$
E_{ik} = \max\left( \sum_{l=1}^{p} PV_{ikl},\; 0 \right).
$$

Here, PV_{ikl} is the PV of trade l at t_i in scenario k, where all the trades with
indices l ∈ {1, ..., p} belong to one netting set.

• Consideration of Collateral Effects: For each exposure path, we have to apply
effects from collateral agreements which can reduce the exposure dramatically.
If the credit exposure against a counterparty is uncollateralised, it is necessary to
model the future distribution of risk factors over the full time horizon of all
transactions. In this case, typical long-term assumptions such as mean reversion
and drift have to be carefully considered. In the case of partial (or full)
collateralisation we will have to model counterparty exposure over much shorter
periods (the remargin frequency). This can be done by VaR methodologies
known from market risk.

• Calculation of Risk Measures: Using the simulated paths of exposure figures E_{ik},
a number of different statistical measures can be determined, for example the
expected exposure EE_i and the PFE_i of a netting set for time t_i:

$$
EE_i = \frac{1}{K} \sum_{k=1}^{K} E_{ik}, \qquad
PFE_i = q_{1-\alpha\%}\left( E_{ik} : k \in \{1, \ldots, K\} \right).
$$

Figure 10.9 shows PFE_i, EE_i and the maximum exposure for (1) a single interest
rate swap and (2) a portfolio of two interest rate swaps.
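The aggregation and risk-measure steps can be illustrated with a deliberately small Python sketch. It is a toy under stated assumptions, not the simulation engine described in the text: the PV of each trade is assumed to follow a driftless random walk with a hypothetical annualised volatility, whereas a real implementation would revalue every trade from simulated risk factors. All names and parameter values are illustrative.

```python
import random


def simulate_exposure_profile(n_steps=12, n_scenarios=2000, dt=1.0 / 12.0,
                              vols=(10.0, 15.0), alpha=0.05, seed=42):
    """Return (EE_i, PFE_i) per time step for one netting set.

    Toy assumption: the PV of trade l follows a driftless random walk with
    annualised volatility vols[l].
    """
    rng = random.Random(seed)
    # exposures[i][k] holds E_ik for time step i and scenario k
    exposures = [[0.0] * n_scenarios for _ in range(n_steps)]
    for k in range(n_scenarios):
        pv = [0.0] * len(vols)
        for i in range(n_steps):
            for l, vol in enumerate(vols):
                pv[l] += vol * dt ** 0.5 * rng.gauss(0.0, 1.0)
            # netting: sum the trade PVs first, then floor the sum at zero
            exposures[i][k] = max(sum(pv), 0.0)

    ee, pfe = [], []
    for row in exposures:
        ordered = sorted(row)
        ee.append(sum(ordered) / n_scenarios)  # average over scenarios
        # empirical (1 - alpha) quantile of the simulated exposures
        idx = min(int((1.0 - alpha) * n_scenarios), n_scenarios - 1)
        pfe.append(ordered[idx])
    return ee, pfe


ee_profile, pfe_profile = simulate_exposure_profile()
```

Because the toy trades never mature, the resulting EE profile only shows the diffusion effect; amortisation would have to be added by letting trades expire along the time grid.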

There are a number of additional exposure measures which play an important 
role in EAD estimation. For our purposes we will need the following parameters: 

The expected positive exposure (EPE) is defined as the average expected
exposure through time and can be interpreted as a single-number representation
of exposure; its formal definition is

$$
EPE := \sum_{t_i \le 1} EE_i \cdot \Delta t_i .
$$

It is the time-weighted sum over all time points less than or equal to 1 year.

In the simulation of future scenarios one typically observes that the number
of remaining trades within a netting set and the number of remaining cash flows
decrease over time. As a consequence, the “amortisation effect” starts dominating the
“diffusion effect” at some point t_i, and so EE_i decreases for increasing values of t_i.
As a certain percentage of expiring trades are likely to be replaced by new trades
(esp. short-term trades), within Basel II the parameter EPE has been replaced by
the so-called effective expected positive exposure (EEPE), which is calculated as
follows:

In the first step the effective expected exposure (EEE) is calculated by the
following recursive definition (valid for t_i ≤ 1):

$$
EEE_i = \max\{EEE_{i-1},\, EE_i\}, \qquad EEE_0 = EE_0 .
$$

The idea behind this definition is that, in order to avoid the amortisation effect
mentioned above, expiring trades will not reduce the value of the effective EE. For
t_i > 1 we define EEE_i := EE_i.


198 R. Hahn and S. Reitz 


Fig. 10.9 Exposures from swap positions (upper panel: maximum exposure, PFE and EE over
years 0 to 6; lower panel: receiver swap 6 years, payer swap 3.5 years and total PFE)


After having defined EEE we can calculate the effective EPE (EEPE) in the
same way we derived EPE from EE:

$$
EEPE = \sum_{t_i \le 1} EEE_i \cdot \Delta t_i .
$$

This definition mitigates the roll-over risk coming from short-term OTC
derivatives or repo-style transactions, which would otherwise lead to an
underestimation of EPE.
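The step from an EE profile to EPE, EEE and EEPE is a short computation, sketched here in Python. The function name and the grid convention are assumptions made for illustration: times are in years and increasing, the first interval is assumed to start at t = 0, and beyond the 1-year horizon the effective EE is set equal to EE, following the reading of the definition given above.

```python
def effective_profile(times, ee, horizon=1.0):
    """Return (EPE, EEE profile, EEPE) for an EE profile on a time grid.

    Assumption: times are increasing, in years, and the first interval
    starts at t = 0, so that delta_t_i = t_i - t_(i-1).
    """
    eee = []
    running_max = float("-inf")
    for t, e in zip(times, ee):
        if t <= horizon:
            running_max = max(running_max, e)  # EEE_i = max(EEE_(i-1), EE_i)
            eee.append(running_max)
        else:
            eee.append(e)  # beyond the horizon EEE_i is taken as EE_i

    epe = eepe = 0.0
    previous_t = 0.0
    for t, e, eff in zip(times, ee, eee):
        if t <= horizon:
            dt = t - previous_t
            epe += e * dt      # time-weighted sum of EE
            eepe += eff * dt   # time-weighted sum of effective EE
        previous_t = t
    return epe, eee, eepe


# Hypothetical quarterly EE profile that amortises after the second quarter
epe, eee, eepe = effective_profile([0.25, 0.5, 0.75, 1.0, 1.25],
                                   [4.0, 8.0, 6.0, 2.0, 1.0])
```

In this hypothetical profile the EE falls after the second quarter, so the EEE stays at its running maximum of 8 and the EEPE (7.0) exceeds the EPE (5.0).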


10.2.4 Estimation of EAD for Derivative Products


The IMM approach in Basel II allows banks to estimate EAD using cross-product
netting. This means that within a predefined netting set of transactions, an EE profile
(including portfolio effects) has to be created and from this the EEPE profile is
calculated as described above. After that, the EAD is defined as


EAD = α · EEPE,


where α is a multiplier which reflects the granularity and concentration of the
portfolio under consideration. In the (hypothetical) case of a portfolio with infinite
diversification, α will be 1. In reality, portfolios consist of a finite number of
counterparties. Furthermore, there are non-zero correlations between exposures and
there might be wrong-way risk (i.e., a non-zero correlation between exposures and
default events), which leads to an α factor larger than 1.

Why do we need the factor α? As we replace the set of all possible future
exposure paths for each counterparty by a single number (the EPE), we are calculating
the economic capital by replacing random exposure distributions with
non-random EPE figures per counterparty.

It can be shown (cf. Wilde 2001) that in the case of a portfolio with an infinite 
number of counterparties with small exposures (infinite diversification), where zero 
correlation among exposures and between exposures and default events can be 
assumed, the economic capital of the actual portfolio equals the economic capital of 
a hypothetical portfolio which consists of non-random exposures of the size EPE 
for each counterparty: 


economic capital (actual portfolio) = economic capital (portfolio with EPE exposures).


In this context, EPE is an accurate loan-equivalent measure for calculating
economic capital (the term loan equivalent is used for a fixed amount that replaces
a random exposure in the process of capital calculation).

Now, for real portfolios the above-mentioned conditions are not satisfied. This
means that the following ratio is larger than 1:

$$
\alpha = \frac{\text{economic capital (real portfolio)}}
              {\text{economic capital (portfolio with EPE exposures)}} .
$$


The IMM approach in Basel II allows banks to determine the factor α by their own
estimation instead of using the fixed value of α = 1.4. There is a floor of α = 1.2
for bank-internal estimations of α in order to limit model risk.

A procedure for an estimation of α for a given portfolio could be as follows:


• Consider a portfolio with a given number y of counterparties with an average
probability of default PD and a given asset correlation ρ.

• Specify an MtM distribution for a given time horizon for each counterparty in
the portfolio.

• Calculate the EPE based on the MtM distribution for each counterparty. The
EPE for the whole portfolio is the sum of the individual EPE values.

• Calculate the distribution of losses in two cases:

  1. Random exposures at the default time point; independent exposures for
  different counterparties

  2. Fixed individual exposures (EPE value) for each counterparty at the default
  time point

• Compare any economic capital measure (e.g., the 99% quantile of the loss
distribution) in both cases; the ratio of both numbers defines the factor α.
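The procedure above can be sketched as a small Monte Carlo experiment in Python. Everything beyond the listed steps is an assumption made for illustration: defaults are driven by a one-factor Gaussian model with asset correlation ρ, each counterparty's MtM is an independent normal variable at a single horizon (so its EPE equals the closed-form EE of the normal case), LGD is set to 100%, and the parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def empirical_quantile(values, q):
    """Simple empirical quantile: element at position int(q * n) of the sorted sample."""
    ordered = sorted(values)
    return ordered[min(int(q * len(ordered)), len(ordered) - 1)]


def estimate_alpha(n_cpty=50, pd=0.05, rho=0.2, mu=0.0, sigma=1.0,
                   n_scenarios=5000, q=0.99, seed=1):
    """Ratio of loss quantiles: random exposures vs. fixed EPE exposures."""
    rng = random.Random(seed)
    std = NormalDist()
    threshold = std.inv_cdf(pd)  # default if the asset return falls below this
    d = mu / sigma
    epe = mu * std.cdf(d) + sigma * std.pdf(d)  # E[max(MtM, 0)] per counterparty

    loss_random, loss_fixed = [], []
    for _ in range(n_scenarios):
        z = rng.gauss(0.0, 1.0)  # systematic factor shared by all counterparties
        lr = lf = 0.0
        for _ in range(n_cpty):
            asset = rho ** 0.5 * z + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
            if asset < threshold:
                lr += max(rng.gauss(mu, sigma), 0.0)  # case 1: random exposure
                lf += epe                             # case 2: fixed EPE exposure
        loss_random.append(lr)
        loss_fixed.append(lf)
    return empirical_quantile(loss_random, q) / empirical_quantile(loss_fixed, q)


alpha_estimate = estimate_alpha()
```

With such a small number of counterparties the estimate is noisy and depends strongly on the chosen quantile and seed; the sketch is only meant to show how the two loss distributions are built and compared.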

Chapter 11
EAD Estimates for Facilities with Explicit Limits 


Gregorio Moral 


11.1 Introduction 


The estimation of exposure at default, EAD, for a facility with credit risk, has 
received a lot of attention, principally in the area of counterparty risk and has 
focused on situations where the variability of the exposure is due to: the existence of 
variability in the underlying variables of a derivative; the use of a fixed nominal 
amount not expressed in the presentation currency; or the existence of collateral 
whose value (variable over time), reduces the exposure. Less attention has been 
given to the case of loan commitments with explicit credit limits. In this case, the 
source of variability of the exposure is the possibility of additional withdrawals 
when the limit allows this. The implementation of Basel II is forcing credit institu- 
tions to address this problem in a rigorous, transparent and objective manner. 
Moreover, Basel II imposes a set of minimum conditions on the internal EAD 
estimates in order to allow the use of these as inputs in the calculation of the 
minimum capital requirement. Currently, credit institutions have problems meeting 
the requirements of both the data and the methodologies. 

This chapter analyses various methods for estimating EAD for facilities with 
explicit limits and tries to assess their optimality from both an internal and a 
regulatory point of view. It focuses on objective methods, based on a reference 
data set (RDS) extracted from observed defaulted facilities, which are frequently 
used in practice by banks. Section 11.2 presents the definition of realised conver- 
sion factors (realised CFs) that are the basic input in most of the estimation 
procedures. Section 11.3 describes several approaches for computing realised 
CFs: “Fixed Time Horizon”, “Cohort Approach”, and “Variable Time Horizon” 
and summarises their pros and cons. Section 11.4 explores issues that have to be 


¹ The views expressed in this paper are the responsibility of the author and do not
necessarily reflect those of Banco de España.

G. Moral

Banco de España¹

e-mail: Gregorio.Moral@bde.es


B. Engelmann and R. Rauhmeier (eds.), The Basel II Risk Parameters, 201 
DOI 10.1007/978-3-642-16114-8_11, © Springer-Verlag Berlin Heidelberg 2011 




addressed before estimating EADs such as: structure and scope of the reference data 
set (RDS); data cleaning; treatment of observations with negative or greater than 
one CFs; and risk drivers. Section 11.5 focuses on EAD estimates. First, it estab- 
lishes the equivalence between EAD estimators and CF estimators under certain 
conditions. Second, the most common methods used by banks in practice are 
presented as special cases of optimisation problems. It concludes that these methods 
are solutions for regression problems with quadratic and symmetric loss functions. 
Section 11.6 discusses issues related to the optimality of the estimates and intro- 
duces a different kind of loss function, one that is linear and asymmetric. These loss 
functions are naturally linked to Basel II capital requirements and they are used to 
derive optimal estimators that, consequently, could be more appropriate when the 
estimates are used for computing capital requirements under Advanced Internal 
Ratings-Based approaches (AIRB). Section 11.7 illustrates issues discussed in the 
previous sections and the consequences of using different estimation methods with 
a stylised but realistic example. Finally, Sect. 11.8 summarises the current practice 
on CF and EAD estimation, highlights problematic aspects, suggests possible 
improvements and concludes that traditional methods, based on averages, are less 
conservative than those based on quantiles. 


11.2 Definition of Realised Conversion Factors 


In practice, when estimating the EAD for a non-defaulted facility, f, with an explicit
credit limit,² there are two main classes of methods in terms of the basic equation
used to link the estimated EAD with the limit:


• In the first class, estimates of the EAD are based on a suitable conversion factor
for the total limit of the facility, EAD(f) = CCF(f) · Limit(f).

• In the second class, estimates of the EAD are based on another factor³ applied to
the undrawn part of the limit, EAD(f) = Current Exposure(f) + LEQ(f) ·
Undrawn Limit(f).⁴


² For example, credit lines which are committed, i.e. the borrower can draw additional
amounts until a limit L(t) is reached.

³ In the Revised Framework and the Capital Directive such factors are called Credit Conversion
Factors (CCFs) and Conversion Factors (CFs) respectively. In the drafts of Rules for
Implementation of Basel II in the US the factor used is called LEQ factor and the Guidelines by
CEBS use the term Conversion Factors (CFs). In this chapter, for clarity, conversion factors
that are applied to the undrawn amount are called Loan Equivalent (LEQ) factors and the term
Credit Conversion Factor, CCF, is reserved for the factor related to the total limit.

⁴ This is the approach required for these types of facilities in the Revised Framework, the
Capital Directive, the drafts of Rules for Implementation of Basel II in the US, and in the CEBS
Guidelines.

Fig. 11.1 Definition of realised LEQ factor


As shown in Sect. 11.5, both approaches are equivalent and the problem of
EAD(f) estimation can be reduced to the estimation of suitable conversion factors
CF(f) (CCF(f) or LEQ(f)).

In order to obtain the CF estimates, banks use as basic data a set of observations
of defaulted facilities at specific dates prior to the default time. Most of the
estimation methods used are based on certain statistics, related to the increase in
the usage of the facility⁵ between a reference date and the default date, computed
from these observations. One of these statistics is called the realised LEQ
factor and is defined below.

Consider a defaulted facility g with an exposure that varies over time, given by E(t),
and a credit limit given by L(t). Figure 11.1 presents the evolution of the exposure.

If the facility has a default date td, given a reference date tr < td, the pair
i = {g, tr} is called the index of the observation. If EAD_i stands for the observed
exposure at default,⁶ E(td), this can be expressed in terms of the exposure and the
limit of the facility observed at the reference date, assuming that L(tr) ≠ E(tr), as:

$$
EAD_i = E(tr) + LEQ_i \cdot (L(tr) - E(tr)) \tag{11.1}
$$

where LEQ_i is given by:

$$
LEQ_i = \frac{EAD_i - E(tr)}{L(tr) - E(tr)} \tag{11.2}
$$


⁵ Throughout this chapter the term “usage” refers to the usage of the facility in euros
(sometimes the terms exposure, drawn amount or utilization are used with the same meaning).

⁶ In this chapter, it is assumed that a precise definition of observed EAD for defaulted
facilities, EAD_i, has been established previously and that it is applied consistently across
facilities and over time for different internal purposes. To understand why an explicit
definition of observed EAD is necessary see Araten and Jacobs (2001, p. 37), where two
situations are cited when the simple definition of EAD_i (“final amounts shown at the time of
default”) is not adequate: charge-offs or seizures of collateral occurred just prior to the
default date.




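or, equivalently, in terms of the percent usage e(tr) = E(tr)/L(tr) and the percent
exposure at default ead_i = EAD_i/L(tr):

$$
LEQ_i = \frac{ead_i - e(tr)}{1 - e(tr)} \tag{11.3}
$$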


Therefore, given an observation, O_i, characterised by a pair i = {g, tr}, with
L(tr) ≠ E(tr), the former formulae can be used to compute a realised LEQ factor. We
denote the realised LEQ factor associated with the observation O_i by LEQ_i, and by
LEQ(tr) when the focus is on the reference date tr.

There are three limitations when using this statistic as the basic input for
estimation procedures:


• It is not defined when L(tr) = E(tr). This implies that it is not possible to
estimate EAD(f) directly based on the value of this statistic for facilities that
at the current date exhibit percent usage, e(tr), equal to one.⁷

• It is not stable when L(tr) ≈ E(tr). This means that realised LEQ factors are not
very informative when percent usage is close to one. As shown in Sect. 11.4.2.2,
the different behaviour of realised LEQ factors, depending on the level of credit
percent usage at the reference date, has important practical consequences.

• It does not take into account changes in the limit over time. In formulae
(11.2) and (11.3) realised LEQ factors have been defined without taking into
account possible changes in the limit of the facility between the reference
date and the default date. As shown in detail in Sect. 11.4.2.3, this is
only one of the causes that explain the existence of realised LEQ_i factors
greater than one.


For these reasons, banks sometimes use other statistics as their basis for estimating
EADs. For example, an obvious possibility is to define realised CCFs similarly
to realised LEQ factors. By using an equation analogous to (11.1) the expression for
this statistic is given by the percent exposure at default:

$$
CCF_i = \frac{EAD_i}{L(tr)} = ead_i \tag{11.4}
$$

Although this statistic is less used in practice than LEQ_i for these types of
facilities, it has two advantages:


• The realised CCF is well defined even when L(tr) = E(tr).
• It is stable even when L(tr) ≈ E(tr).


⁷ This limitation applies when the estimates are used for internal purposes because, in
principle, internal uses do not need to assume that LEQ(f) ≥ 0, or equivalently, that the
EAD(f) estimate has to be greater than or equal to the current exposure of this facility, E(t).

⁸ Some banks define realised LEQ factors by using E(td)/L(td), percent usage at default,
instead of ead_i = E(td)/L(tr), percent exposure at default, in (11.3). The aim of this
definition is to take into account changes in the credit limit after the reference date and to
avoid computing realised LEQ factors greater than 1. It is straightforward to show that the
former definition is consistent with (11.1) if EAD_i is multiplied by the factor L(tr)/L(td).


Sometimes it is said that with this statistic, if the facility g had a constant limit
L(g), it is not necessary to specify a reference date. However, as shown in
Sect. 11.4, data sets for estimation procedures need to include the values of certain
risk drivers that vary over time and therefore it is necessary to consider an explicit
reference date.

Additional useful statistics are introduced in Sect. 11.5; until then it is assumed
that realised LEQs are used as the basis for the estimation process.
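The two statistics defined in this section, and the limitations of the realised LEQ factor, can be made concrete with a minimal Python sketch. The function names and the example figures are hypothetical; returning `None` for a fully drawn facility is one possible convention for the undefined case, not a prescription from the text.

```python
from typing import Optional


def realised_leq(e_tr: float, l_tr: float, ead: float) -> Optional[float]:
    """Realised LEQ factor as in (11.2); undefined when the limit is fully drawn."""
    undrawn = l_tr - e_tr
    if undrawn == 0:
        return None  # L(tr) = E(tr): the statistic is not defined
    return (ead - e_tr) / undrawn


def realised_ccf(l_tr: float, ead: float) -> float:
    """Realised CCF, i.e. the percent exposure at default EAD_i / L(tr)."""
    return ead / l_tr


# Hypothetical observations: (exposure at tr, limit at tr, observed EAD)
half_drawn = realised_leq(60.0, 100.0, 80.0)    # moderate usage
nearly_full = realised_leq(99.0, 100.0, 105.0)  # usage close to one: unstable
fully_drawn = realised_leq(100.0, 100.0, 105.0) # undefined
ccf_nearly_full = realised_ccf(100.0, 105.0)    # well defined and stable
```

For the nearly fully drawn facility a 6 € increase in exposure produces a realised LEQ of 6, while the realised CCF is a modest 1.05, which illustrates why the LEQ statistic is not stable when usage is close to one.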


11.3 How to Obtain a Set of Realised Conversion Factors 


Given a set of defaulted facilities, there are several approaches frequently employed
by banks to obtain realised conversion factors or other statistics⁹ that can be used,
together with other information, to obtain estimates for the EAD of non-defaulted
facilities. All these approaches are based on observations of defaulted facilities
at specific reference dates prior to the default date. Depending on the rule used
for selecting these reference dates we refer to these approaches as: Fixed Time
Horizon, Cohort Approach or Variable Time Horizon.


11.3.1 Fixed Time Horizon 


In this approach, first a time horizon, T, is selected and second, for each defaulted
facility with L(td - T) ≠ E(td - T), a realised LEQ factor is computed by using td - T
as the reference date:

$$
LEQ(td - T) = \frac{E(td) - E(td - T)}{L(td - T) - E(td - T)} \tag{11.5}
$$

In practice, T is frequently set to 1 year (Fig. 11.2).

Fig. 11.2 Realised LEQ with the fixed time horizon approach


Drawbacks:


• The fixed time horizon, T, is conventional.

• It is not possible to include directly defaulted facilities when the age of the
facility at the date of default is less than T.

• It does not take into account all the relevant information because for each facility g
defaulted during the observation period, only the observation {g, td - T} is used.

• It does not take into account the possibility that current exposures can default at
any moment during the following year. Implicitly, estimates based on this
approach assume that the default date for each facility that will default over
the following 12 months will be the end of this period. This assumption could
introduce bias into the estimates.


Advantages:


• Dispersion of reference dates.

• The use of a common horizon, T = td - tr, contributes to the homogeneity of
the realised LEQs.


⁹ As shown in Sect. 11.5, in addition to the realised CFs, the percent increase in usage
between the reference date and the default date or the increase in exposure between those
dates are statistics that can be used to estimate CFs or EADs.
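The fixed time horizon rule can be sketched in a few lines of Python. The data layout is an assumption made for illustration: the history of a defaulted facility is a dictionary that maps a month index to the pair (exposure, limit), and the horizon T is expressed in months.

```python
from typing import Dict, Optional, Tuple

History = Dict[int, Tuple[float, float]]  # month index -> (exposure, limit)


def fixed_horizon_leq(history: History, td: int, horizon: int = 12) -> Optional[float]:
    """Realised LEQ as in (11.5), using td - T as the single reference date."""
    tr = td - horizon
    if tr not in history:
        return None  # facility younger than T at default: cannot be used directly
    e_tr, l_tr = history[tr]
    if l_tr == e_tr:
        return None  # fully drawn at the reference date: LEQ undefined
    ead = history[td][0]  # observed exposure at default, E(td)
    return (ead - e_tr) / (l_tr - e_tr)


# Hypothetical facility observed 12 months before default and at default
leq_example = fixed_horizon_leq({0: (40.0, 100.0), 12: (70.0, 100.0)}, td=12)
```

The second drawback listed above shows up directly in the sketch: a facility that is less than T old at default has no observation at td - T and yields no realised LEQ.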


11.3.2 Cohort Method 


First, the observation period¹⁰ is divided into intervals of a fixed length (cohorts), for
example 1-year intervals. Second, the facilities are grouped into cohorts according to
the interval that includes their default dates. Third, in order to compute a realised
LEQ factor associated with each facility, the starting point of the time interval that
contains its default date is used as the reference date, ti ∈ {t1, t2, ..., ti, ..., tn}:

$$
LEQ(t_i) = \frac{E(td) - E(t_i)}{L(t_i) - E(t_i)} \tag{11.6}
$$

This is illustrated in Fig. 11.3.

Fig. 11.3 Realised LEQ with the cohort approach


¹⁰ The period of time covering the data is the observation period.



Drawbacks:


• The length of cohorts is conventional.

• The reference dates are conventional.

• It does not use all the relevant available information because for each facility g
defaulted during the observation period (and included in a cohort with initial
date t_i) only the observation {g, t_i} is used.

• The reference dates are concentrated.

• The realised LEQs are less homogeneous than those computed by using a fixed
time horizon. The reason is that this approach computes LEQ_i factors with very
different values for the horizon (td - tr).


Advantages:


• It does take into account the possibility that current exposures can default at any
moment during the following year.
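A minimal Python sketch of the cohort rule follows, again under illustrative assumptions: dates are integer month indices, the history of a facility maps a month to (exposure, limit), and cohorts are taken as half-open intervals (ti, ti + length] so that the reference date always lies strictly before the default date.

```python
from typing import Dict, Optional, Tuple

History = Dict[int, Tuple[float, float]]  # month index -> (exposure, limit)


def cohort_leq(history: History, td: int, cohort_length: int = 12,
               start: int = 0) -> Tuple[int, Optional[float]]:
    """Return (reference date ti, realised LEQ) for the cohort containing td.

    Assumption: cohorts are the intervals (ti, ti + cohort_length], so a default
    exactly on a cohort boundary belongs to the cohort that ends on that date.
    """
    ti = start + ((td - start - 1) // cohort_length) * cohort_length
    if ti not in history:
        return ti, None  # facility did not exist at the start of the cohort
    e_ti, l_ti = history[ti]
    if l_ti == e_ti:
        return ti, None  # fully drawn at the reference date: LEQ undefined
    ead = history[td][0]  # observed exposure at default, E(td)
    return ti, (ead - e_ti) / (l_ti - e_ti)


# Hypothetical facility defaulting in month 15: its cohort starts in month 12
ti_example, leq_example = cohort_leq(
    {0: (40.0, 100.0), 12: (50.0, 100.0), 15: (80.0, 100.0)}, td=15)
```

In this example the horizon td - tr is only 3 months, whereas a default in month 24 would give a horizon of 12 months, which is the lack of homogeneity noted in the drawbacks.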


11.3.3 Variable Time Horizon 


First, a range of horizon values (e.g., 1 year) for which we are going to compute
LEQ_i factors is fixed. Second, for each defaulted facility we compute the realised
LEQ factors associated with a set of reference dates (for example,¹¹ 1 month,
2 months, ..., 12 months before default).

The rationale for this method is to take into account a broader set of possible
default dates than in the other approaches when estimating a suitable LEQ factor for
a non-defaulted facility conditional on default during the following year.

$$
LEQ(td - j) = \frac{E(td) - E(td - j)}{L(td - j) - E(td - j)},
\qquad j = 1, \ldots, 12 \text{ months} \tag{11.7}
$$


¹¹ Although with this approach, in theory, it is not necessary to use monthly observations,
from now on it is assumed that the reference dates are the end of each month from the first
month before the default date (td - tr = 1) to 12 months before (td - tr = 12). This choice
may be adequate for most of the product types and, in many cases, compatible with the
information currently available in banks.



In principle,¹² twelve realised LEQs could be associated with each defaulted
facility (Fig. 11.4). However, these LEQ factors are clearly not homogeneous in the
sense that some of these values are computed by using observations very close to
the default date (i = {g, td - 1}) and others are based on observations 1 year before
default (j = {g, td - 12}). This means that it is necessary to recognise these
differences via risk drivers. As shown in Sect. 11.4.3, the key point is to take into
account when the bank identified the facilities as non-normal and, consequently, for
the purpose of obtaining estimates for facilities in a “normal status”, to use only
observations meeting this requirement. The main reason is that, near to default,
borrowers are in general classified in a non-normal internal class (in the following,
the variable that identifies these different internal classes is called “status”). This
means that a facility is subject to close monitoring and, in general, the borrower
cannot make additional drawdowns under the same conditions as before. For example,
in retail portfolios during the last 3 months before default, following the first
impairment, it is very difficult for the borrower to make further drawdowns and, in
general, only interest and other internal charges are allowed. Therefore, it is
necessary to identify when a defaulted facility was labelled as non-normal and to use
only the realised LEQs associated with previous dates when estimating LEQ factors for
normal facilities. In practice, for retail portfolios, at least six dates can frequently be
used, and as a maximum, nine dates. On the other hand, for corporate portfolios, the
status of the facilities is closely linked to the internal rating of the borrower, and
therefore there could be cases in which the normal status applies until it is known
that the borrower has defaulted.

In general, it is necessary to take into account the twelve separate LEQ_i factors
associated with the same facility because the values of the risk drivers can be
different for each reference date.

¹² For example, if a facility is only 4 months old when it defaults, then we will have at
most four associated LEQ factors.

Fig. 11.4 Realised LEQs with the variable time horizon method


Advantages:


• It takes into account more observations than the previous methods.

  - Those facilities with L(tr) = E(tr) that were not taken into account in the
  previous methods can now be used for those reference dates when L(td - i) ≠
  E(td - i) for some i = 1, ..., 12.

  - Each facility could produce up to twelve LEQ_i associated with twelve
  different observations.

• In principle, estimation procedures based on these data should produce more stable
(more observations are used) and more accurate (more information is used) estimates.


Drawbacks:


• Banks have to store more data for each defaulted facility (up to twelve
observations).

• It is necessary to use a variable (status) that contributes to identifying
homogeneous LEQ_i factors.
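The variable time horizon rule, including the restriction to reference dates at which the facility was still in normal status, can be sketched as follows. The data layout, the status label "normal" and the example figures are assumptions made for illustration only.

```python
from typing import Dict, Tuple

History = Dict[int, Tuple[float, float]]  # month index -> (exposure, limit)


def variable_horizon_leqs(history: History, status: Dict[int, str], td: int,
                          max_horizon: int = 12) -> Dict[int, float]:
    """Realised LEQs as in (11.7), keyed by the horizon j = td - tr in months.

    Only reference dates with status "normal" and L(tr) != E(tr) are kept.
    """
    ead = history[td][0]  # observed exposure at default, E(td)
    leqs = {}
    for j in range(1, max_horizon + 1):
        tr = td - j
        if tr not in history:
            continue  # facility did not exist yet at this reference date
        if status.get(tr) != "normal":
            continue  # already flagged as non-normal: not comparable
        e_tr, l_tr = history[tr]
        if l_tr == e_tr:
            continue  # fully drawn at this date: LEQ undefined
        leqs[j] = (ead - e_tr) / (l_tr - e_tr)
    return leqs


# Hypothetical facility that is 4 months old at default (month 12)
history_example = {8: (50.0, 100.0), 9: (60.0, 100.0), 10: (100.0, 100.0),
                   11: (90.0, 100.0), 12: (95.0, 100.0)}
status_example = {8: "normal", 9: "normal", 10: "normal", 11: "watch"}
leqs_example = variable_horizon_leqs(history_example, status_example, td=12)
```

In this example only the horizons of 3 and 4 months survive: the observation 1 month before default is excluded by its status, and the one 2 months before default because the limit was fully drawn.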


11.4 Data Sets (RDS) for Estimation Procedures 


This section discusses the ideal requirements for the reference data set (RDS) which 
includes the available information that can be used for estimation procedures. It 
focuses on those RDS based on historical information from facilities that defaulted 
over an observation period. First, it presents a general structure for this RDS that 
facilitates the implementation of estimation procedures and then it enumerates 
some fields that should be included in the RDS. Second, it lists certain scope 
requirements. Finally, it comments on several adjustments and decisions that 
have to be made before the estimation phase. 


11.4.1 Structure and Scope of the Reference Data Set 


11.4.1.1 Structure 


Given the focus on estimation procedures based on observations of defaulted
facilities at certain reference dates, it is useful to have a structure for the reference
data set adapted to this approach. Consequently, the data structure should contain
the relevant information on the basis of observations O_i, each of which is associated
with a unique pair formed by a defaulted facility, g, and a valid reference date, tr < td
(more specifically, this pair i = (g, tr) should be the primary key of the
reference data set). Each of these observations, O_i, includes:


• The values of certain static characteristics of g, I(g)

• The values of a set of observable variables related to g at the reference date tr,
that are going to be used as explanatory variables or Risk Drivers, RD(tr)

• The observed EAD_i and default date td


In summary, a very general structure for the RDS is given by: 
$$\mathrm{RDS} = \{O_i\}; \quad O_{i=(g,tr)} = \{(g, tr),\ I(g),\ RD(tr),\ \mathrm{EAD}_i = E(td)\} \tag{11.8}$$


With regard to the fields that contain the information associated with each observation,
in practical implementations, as a minimum, the following data are required:


• Static characteristics, I(g): identifier of facility, NF; type of facility, TF; identifier
of portfolio, TP; and identifier of borrower, NB

• Risk drivers, RD: reference date, tr; default date, td; reference exposure, E(tr);
reference limit, L(tr); facility status, S(tr); and rating class or pool, R(tr)


If other potential risk drivers for the EAD were identified, the RDS should 
contain fields for the values of these potential RD at the reference date tr. For 
example, it is worth considering the inclusion of macroeconomic indicators, MI that 
can be used to increase the forward looking character of the estimates and the 
predictive ability of the estimators. In symbols: 


$$\begin{aligned} RD(tr) &= \{E, L, S, R, td, MI, \text{Other}\} \\ I(g) &= \{NF, NB, TF, TP, \text{Other}\} \end{aligned} \tag{11.9}$$


Risk drivers are discussed in more detail in Sect. 11.4.3. 
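
As an illustration only, the structure in (11.8) can be sketched as a small record type keyed by the pair (g, tr). The class and field names below (`Observation`, `build_rds`, `exposure_tr` and so on) are hypothetical choices made for this sketch and do not come from any particular bank system; only the minimum fields listed above are included.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Observation:
    """One RDS record O_i, identified by the pair i = (g, tr)."""
    facility_id: str     # NF, identifier of the defaulted facility g
    borrower_id: str     # NB
    facility_type: str   # TF
    portfolio_id: str    # TP
    tr: date             # reference date
    td: date             # default date
    exposure_tr: float   # E(tr)
    limit_tr: float      # L(tr)
    status_tr: str       # S(tr)
    rating_tr: str       # R(tr)
    ead: float           # EAD_i = E(td)

    def __post_init__(self):
        # A valid reference date must precede the default date (tr < td).
        if not self.tr < self.td:
            raise ValueError("reference date must precede default date")

    @property
    def key(self) -> Tuple[str, date]:
        """Primary key of the observation: (g, tr)."""
        return (self.facility_id, self.tr)


def build_rds(observations: Iterable[Observation]) -> Dict[Tuple[str, date], Observation]:
    """Index observations by (g, tr) and reject duplicated keys."""
    rds: Dict[Tuple[str, date], Observation] = {}
    for obs in observations:
        if obs.key in rds:
            raise ValueError(f"duplicate primary key {obs.key}")
        rds[obs.key] = obs
    return rds
```

Keeping (g, tr) as the key makes it straightforward to hold up to twelve observations per defaulted facility, one per reference date.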


11.4.1.2 Scope and Other Requirements on the RDS 


In addition to a structure for the RDS suitable for the estimation procedures, the 
RDS has to meet certain internal and external requirements related to the scope of 
the RDS. 


• The scope of the RDS has to be defined without ambiguity. As a minimum, it is
necessary:
- To define the type of facilities, type of borrowers and type of portfolios
- To make explicit the definition of default used and the observation period
covered
- To identify and describe the source (or sources) of the data

• The RDS should include observations for all the facilities that have defaulted
during the observation period and meet the other scope requirements (type of
facilities, portfolios, etc.). All the exclusions should be identified and justified

• The definition of default used should be consistent with the ones used for PD and
LGD estimation purposes

• The observation period should be long enough to include observations of facilities
defaulted under very different general economic circumstances, ideally
covering an entire economic cycle

• Additionally, to use the estimates in capital requirements under AIRB approaches:
- The definition of default should be consistent with the IRB default definition
- The observation period should cover at least 7 years for corporate portfolios
and 5 years for retail portfolios
- When necessary, the observation period should contain a period with downturn
conditions


11.4.2 Data Cleaning 


As well as other more general issues related to data cleaning (identification and
treatment of outliers, elimination of poor quality data, etc.), before the estimation
phase it is necessary to make certain decisions that could affect the observations
included in the RDS. Some of these issues are analysed in the next sections.


11.4.2.1 Treatment of Multiple Credit Facilities with a Single Obligor 


Although it is clear that realised CFs and the other relevant information included in 
the RDS are computed or observed at facility level, under certain circumstances, to 
produce sensible estimates, it could be necessary or appropriate to group together, 
within the same observation, information from different facilities associated with 
the same borrower. There are at least two situations to be considered: 


• If there are two or more observations of similar credit facilities with the same
borrower and the same risk drivers' values, excluding current usages and other
values that are a function of L(t) and E(t), then it could be appropriate to group
these observations in a new observation as¹³:


$$\left.\begin{aligned} &\{(h, tr),\ E(h, tr),\ L(h, tr),\ B(h),\ RD(h, tr)\} \\ &\{(g, tr),\ E(g, tr),\ L(g, tr),\ B(g),\ RD(g, tr)\} \end{aligned}\right\} \Rightarrow O_{(h+g)} = \{(h+g, tr),\ E(h, tr) + E(g, tr),\ L(h, tr) + L(g, tr),\ B,\ RD(tr)\} \tag{11.10}$$


• For certain portfolios and facilities, it is common for the maturity to be 1 year.
However, in most cases, the bank approves a new facility (maybe with a
different limit), when the old facility expires. In these circumstances, facilities
default with age less than 12 months and therefore it is not possible to obtain
twelve observations for the RDS. However, if this facility was approved at the
time of the expiration of a non-defaulted facility of the same type with the same
borrower, it could be useful to chain these facilities together. Using this procedure,
more observations can be included in the RDS.


13 This procedure is mentioned in Araten and Jacobs (2001, p. 36).


Depending on the characteristics of the portfolio, these decisions could be made
on a case by case basis or following a mechanical rule.
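
A mechanical rule for the first situation could look like the following sketch. It groups observations that share borrower, reference date and risk driver values, and adds exposures and limits as in (11.10). The dictionary keys and the function name `group_observations` are hypothetical, and summing the realised EADs of the grouped facilities is an assumption of this sketch that the text does not spell out.

```python
from collections import defaultdict


def group_observations(observations):
    """Merge observations with the same borrower, reference date and risk drivers.

    Each observation is a dict with keys: facility, borrower, tr,
    risk_drivers (a hashable tuple), E, L and EAD.
    """
    groups = defaultdict(list)
    for obs in observations:
        groups[(obs["borrower"], obs["tr"], obs["risk_drivers"])].append(obs)

    merged = []
    for (borrower, tr, risk_drivers), members in groups.items():
        merged.append({
            "facility": "+".join(m["facility"] for m in members),
            "borrower": borrower,
            "tr": tr,
            "risk_drivers": risk_drivers,
            "E": sum(m["E"] for m in members),      # E(h, tr) + E(g, tr)
            "L": sum(m["L"] for m in members),      # L(h, tr) + L(g, tr)
            "EAD": sum(m["EAD"] for m in members),  # assumption: EADs are added too
        })
    return merged
```

Whether such a rule is acceptable depends on the portfolio, so in practice it would be applied only to facility types for which the grouping has been justified.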


11.4.2.2 Treatment of Observations with Negative Realised LEQ Factors¹⁴


As Fig. 11.5 shows, it is possible to obtain negative realised LEQ factors associated
with defaulted facilities.

Arithmetically, negative realised LEQs arise when EAD_i = E(td) < E(tr). This
situation is especially frequent when td − tr is large and the credit percent usage at
the reference date, e(tr), is close to one. Moreover, some of these values are very
large in absolute value. It is very important to note that:


• The empirical distributions of realised LEQ factors conditional on the percent
usage at the reference date, e(tr), are very different

• These empirical distributions are highly asymmetrical, especially for percent
usage values close to one.


[Figure: exposure path between tr and td under the limit L(t), with E(tr) > EAD = E(td)]
Fig. 11.5 Negative realised LEQ factors


14 From a formal point of view, this discussion is similar to that related to realised LGDs. However,
there are substantial differences in the reasons that justify the existence of negative realised values
between both cases.
To illustrate these points, from definition (11.3) it can be seen that a small
increment of e(tr) affects the realised LEQ factor following:


$$\frac{\partial\, \mathrm{LEQ}_i}{\partial\, e(tr)} = -\frac{1 - e(td)}{(1 - e(tr))^2} \tag{11.11}$$


Therefore, the sensitivity of realised LEQ factors to small changes in the percent
usage at the reference date depends critically on the level of e(tr). The smaller
(1 − e(tr))² is, the larger the variability of LEQ_i conditional on e(tr) tends to be.

Moreover, if LEQ_i is expressed in terms of a percent realised exposure at default
ead_i proportional to the percent usage at the reference date, from definition (11.3)
the following is obtained:


$$\mathrm{LEQ}_i(\Delta) = \frac{(1 + \Delta) \cdot e(tr) - e(tr)}{1 - e(tr)} = \Delta \cdot \frac{e(tr)}{1 - e(tr)} \tag{11.12}$$


and for large values of e(tr) there is no possibility of large positive values of Δ, but it is
possible to find large negative values for Δ.

These asymmetries among LEQ_i for low and high percent usage values, and
the existence of more observations with large negative LEQ_i than with large
positive values, have practical importance. The main reason is that, as is shown in
Sect. 11.5, banks frequently use averages of LEQ_i as estimators for LEQ(f) and
these sample means are severely affected by both circumstances. These points
suggest that, as a minimum, these averages should be restricted to those observations
with similar percent usage levels or, in other words, percent usage level should
be a risk driver for LEQ(f).
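
The effect of the usage level on realised LEQ factors can be checked with a few lines of arithmetic. The sketch below assumes the usual definition LEQ_i = (EAD_i − E(tr))/(L(tr) − E(tr)); the function name `realised_leq` and the figures are made up for illustration. With the same limit and the same exposure at default, moving the reference exposure closer to the limit produces increasingly large negative values.

```python
def realised_leq(ead, exposure_tr, limit_tr):
    """Realised LEQ_i = (EAD_i - E(tr)) / (L(tr) - E(tr))."""
    undrawn = limit_tr - exposure_tr
    if undrawn == 0:
        raise ValueError("LEQ is undefined when L(tr) = E(tr)")
    return (ead - exposure_tr) / undrawn


# Same limit and same EAD, different usage at the reference date (invented figures).
limit, ead = 100.0, 90.0
for exposure in (50.0, 80.0, 95.0, 99.0):
    usage = exposure / limit
    print(f"e(tr) = {usage:.2f}  LEQ_i = {realised_leq(ead, exposure, limit):.2f}")
```

A plain average over these four observations would be dominated by the observation with usage 0.99, which is why pooling observations with very different usage levels is problematic.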

As a consequence, it is important to clarify the treatment of those observations
with negative realised LEQ factors. In practice, there are several possibilities:


• Censoring¹⁵ the data (the LEQ_i factors) to impose certain restrictions:
- Some banks change the definition of realised LEQ to force non-negativity:
LEQ_i = max[0, LEQ_i]
- In other cases, banks change the definition of the realised EAD used in LEQ_i
computations directly (observed EAD): EAD_i = max[EAD_i, E(tr)].


As discussed previously, negative LEQ_i can be associated with valid observations
of defaulted facilities. To justify this practice, banks argue that, ceteris
paribus, this adjustment introduces a conservative bias into the estimates.


15 It is necessary to use this terminology (censoring and truncation) carefully because these
words are not used consistently in the literature. For example, Araten and Jacobs (2001, p. 36)
use the term truncation for describing what in this paper is referred to as censoring.
The terminology employed in the text follows that used in Working Paper No. 14 BCBS
(2005, p. 66).
• Truncation: this practice consists of the removal of the observations associated
with negative LEQ_i factors. It is difficult to find a rationale for the truncation of
observations with negative or zero realised LEQs. In principle, this truncation
could be a practical method to generate a stressed distribution of LEQ_i factors.
However, this procedure presents at least two important drawbacks:

- The elimination of observations with LEQ_i < 0 could introduce inconsistencies
with the RDS used for obtaining LGD estimates because some of those
observations could be associated with facilities with high realised losses.

- When the estimation method uses sample averages, the LEQ estimates based
on a truncated RDS could be very unstable with changes in the RDS, depending
on the number of observations with LEQ_i factors close to zero.

• Do nothing with the realised LEQ factors (but set a floor to the estimates,¹⁶
LEQ(f) ≥ 0). This is the most natural decision.


As proved in Sect. 11.6.3.1, if the constraint LEQ(f) ≥ 0 is imposed on the
estimators and a specific model for the estimated LEQ is fitted by minimising the
estimation errors (measured in terms of a special loss function), then the same
estimates are produced by using the original or the censored data.
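
The three treatments can be compared on a toy sample. The sketch below uses a plain sample mean as the estimator, which is an assumption made only to keep the example short (it is not the special loss function of Sect. 11.6.3.1, so the floored and censored results need not coincide here). All function names and figures are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def censor_leq(leqs):
    """Censoring on the factors: LEQ_i = max[0, LEQ_i]."""
    return [max(0.0, x) for x in leqs]


def censor_ead(ead, exposure_tr):
    """Censoring on the observed EAD: EAD_i = max[EAD_i, E(tr)]."""
    return max(ead, exposure_tr)


def truncate(leqs):
    """Truncation: drop the observations with negative LEQ_i."""
    return [x for x in leqs if x >= 0.0]


def floored_estimate(leqs):
    """Keep the realised factors unchanged and floor only the estimate at zero."""
    return max(0.0, mean(leqs))


realised = [0.8, 0.2, -1.0, -0.4]          # invented realised LEQ_i factors
print("raw mean      ", mean(realised))
print("censored mean ", mean(censor_leq(realised)))
print("truncated mean", mean(truncate(realised)))
print("floored       ", floored_estimate(realised))
```

On this sample the truncated mean is the highest of the three, the censored mean lies in between, and flooring the raw mean gives zero, which illustrates how strongly the choice of treatment can move an average-based estimate.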


11.4.2.3 Treatment of Observations with Realised LEQ Factors 
Greater than One 


In principle, given the definition of LEQ_i factors (11.2) it would be natural to expect
LEQ_i factors to be less than or equal to one in a bank with an adequate control
environment. However, the existence of LEQ_i factors greater than one is not in all
cases an indicator of a failure in the controls established by the bank to ensure that
credit limits are effective. There are situations in which LEQ_i factors greater than
one naturally arise. For example:


• In some cases, banks use unadvised limits¹⁷ instead of the nominal limits of the
facilities to manage the risk internally. The possibility of additional drawdowns
for the borrower only stops when the exposure is greater than the unadvised limit

• In some products, for example credit cards or current account overdrafts, such
problems are difficult to avoid because there is typically a time lag between the
current exposure and the figure used by the bank to establish controls

• Sometimes the exposure at default includes the last liquidation of interest (and
fees) and this amount is charged to the account even when the limit had been
previously reached.


16 As a minimum, this floor is a requirement when the estimates are used for regulatory purposes.


17 Frequently, these unadvised limits are computed as a percentage or a fixed amount above the
explicit advised limits.
These excesses over the nominal limits are typically small. In these cases, it
would be appropriate to treat these observations as any other cases. However, in
other circumstances there are observations with large realised LEQ factors that are
the result of several completely different causes, such as:


• Changes in the limit after the reference date and before difficulties in the
facilities become known

• Explicit or implicit change of limit at the date of default or when difficulties with
the facility have already arisen

• Inadequate control environment and existence of human errors or frauds that
could be treated as operational risk events.


In spite of the diversity of the former circumstances, some banks cap all the 
realised EADs at one. In general, this rule is neither adequate for internal use nor for 
regulatory use and, on the contrary, a detailed analysis of the causes behind these 
observations is necessary before making acceptable decisions for each situation. 
In any case, coherence with the procedures used when calculating realised LGDs is 
a prerequisite. 


11.4.3 EAD Risk Drivers 


In practice, risk drivers (RD) affect the estimates in two different ways. First,
certain qualitative and quantitative characteristics are used to segment the portfolio
under analysis into homogeneous classes. Among these risk drivers, different studies
state as a minimum:


• Facility type: this characteristic is important because there is a spectrum
of facilities with explicit limits and different conditions for drawdowns, ranging
from facilities with unconditional limits, to facilities in which each drawdown
requires approval.

• Covenants: frequently the bank can deny additional drawdowns when specific
circumstances occur. The clauses which detail these circumstances are called
covenants.¹⁸ Typically, these covenants are related to objective situations that
are indicators of credit deterioration of the borrower such as: downgrades,
drops in profitability or changes in certain key financial ratios below explicit
thresholds.¹⁹


18 Sometimes these clauses are called Material Adverse Changes (MAC) clauses. See Lev and
Rayan (2004, p. 14).
19 For more details on covenants, see Sufi (2005, p. 5).


Second, once we have identified a class including facilities that, in principle, are
homogeneous enough for the purpose of designing a common explanatory EAD
model, it is necessary to select an appropriate set of explanatory (quantitative)
variables (risk drivers). Among these quantitative risk drivers, different studies,
based on private databases, suggest it is convenient to consider as a minimum:


• Commitment size L(tr)

• The drawn and undrawn amounts, E(tr) and L(tr) − E(tr)

• The credit percent usage at the reference date e(tr). As discussed in 11.4.2.2,
this percent usage value has discriminative power with regard to realised LEQ
factors

• The time to default td − tr: ex-post analysis shows that this variable has significant
explanatory power, at least close to default

• The rating class at the reference time R(tr): this variable is in general relevant,
but different studies have found a significant positive correlation between credit
quality and CF in some cases and a significant negative correlation in others.
It seems that the role of the rating as a relevant risk driver is linked to the type
of portfolio, the dynamic of each rating system and the uses of the rating for
internal purposes

• Status of the facility at the reference date S(tr): most banks, in addition to rating
or scoring systems, have warning systems that focus on early identification of
liquidity problems and other short term borrower difficulties. The basic difference
with the rating is that these warning systems are more dynamic and identify
problems before the rating²⁰ does. As a result of these systems, certain facilities
are classified into certain broad classes, typically: normal status and a few grades
under special vigilance. This means that once a facility has been identified
as linked to a problematic borrower the level of monitoring and, in some
cases, the practical conditions for additional drawdowns are changed.²¹ Therefore,
the status is a critical risk driver when estimating EAD

• Macro indicators.


For the observations in the RDS, the values of the above listed risk drivers are,
in general, known. For a non-defaulted facility, the values of these variables are
computed using the current date t as the reference date tr. With regard to the time to
default, there is a problem because, for a non-defaulted facility, the time to default is
unknown. In the Basel II context, the interest is in EAD estimates subject to the
condition that the facility defaults during a period of 1 year. Therefore, in this
context, the interest is in the influence of this variable when the value ranges from
1 to 12 months.


20 The most common relationship between these early warning systems and the ratings is that
certain changes of status trigger the processes for a new evaluation of the borrower rating.


21 § 477. “Due consideration must be paid by the bank to its specific policies and strategies adopted
in respect of account monitoring and payment processing. The bank must also consider its ability
and willingness to prevent further drawings in circumstances short of payment default, such as
covenant violations or other technical default events. Banks must also have adequate systems and
procedures in place to monitor facility amounts, current outstandings against committed lines and
changes in outstandings per borrower and per grade. The bank must be able to monitor outstanding
balances on a daily basis.”, BCBS (2004).
11.5 EAD Estimates


11.5.1 Relationship Between Observations in the RDS 
and the Current Portfolio 


This section presents different methods of assigning a 1-year EAD estimate to a
non-defaulted facility f at the date t, included in the current portfolio, based on
a subset of a RDS which comprises observations (of defaulted facilities) similar
to f at t. We denote this subset by RDS(f).

The process of assigning a subset of the RDS to each facility in the portfolio is 
called “mapping” and this allows the current portfolio to be classified by grouping 
facilities with the same or “similar” RDS(f). Conversely, some banks segment 
the portfolio of current exposures into classes comprising “similar” facilities. This 
approach could be reduced to the previous one because after this classification of 
exposures, each class C has to be mapped into a RDS(C) which is used to estimate 
EAD(f) for all f included in C. 


11.5.2 Equivalence Between EAD Estimates and CF Estimates 


Given a non-defaulted facility f and an estimator EAD(f), if L(f) ≠ E(f), the
estimate can be expressed in terms of a LEQ(f) factor following the equation:


$$\mathrm{EAD}(f) = E(f) + \mathrm{LEQ}(f) \cdot (L(f) - E(f)) \tag{11.13}$$

if LEQ(f) is given by:


$$\mathrm{LEQ}(f) = \frac{\mathrm{EAD}(f) - E(f)}{L(f) - E(f)} = \frac{ead(f) - e(f)}{1 - e(f)} \tag{11.14}$$


Additionally, if we are interested in EAD(f) estimates that satisfy EAD(f) ≥
E(f), then from (11.13):


• If L(f) > E(f) then EAD(f) ≥ E(f) if and only if LEQ(f) ≥ 0
• If L(f) < E(f) then EAD(f) ≥ E(f) if and only if LEQ(f) ≤ 0


Therefore, without any additional hypothesis, for facilities that verify L(f) ≠
E(f), it has been shown that to estimate EAD(f), it is sufficient to focus on methods
that estimate suitable conversion factors LEQ(f) based on the observations
included in the reference data set, RDS(f), and afterwards to employ (11.13) to
assign individual EAD estimates.

Finally, the simplest procedure to estimate a class EAD is to add the individual 
EAD estimates for all the facilities included in the class. 
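
A minimal sketch of (11.13) and of the class-level aggregation follows. The function names `ead_from_leq` and `class_ead` are hypothetical, and the conversion factors passed in are assumed to have been estimated elsewhere.

```python
def ead_from_leq(exposure, limit, leq):
    """Individual estimate from (11.13): EAD(f) = E(f) + LEQ(f) * (L(f) - E(f))."""
    return exposure + leq * (limit - exposure)


def class_ead(facilities):
    """Class EAD as the sum of individual estimates.

    `facilities` is an iterable of (E(f), L(f), LEQ(f)) triples.
    """
    return sum(ead_from_leq(e, l, leq) for e, l, leq in facilities)


# Invented portfolio: (current exposure, limit, estimated LEQ)
portfolio = [(40.0, 100.0, 0.5), (10.0, 50.0, 0.25), (80.0, 80.0, 0.6)]
print(class_ead(portfolio))
```

Note that for the fully drawn facility in the example (L(f) = E(f)) the conversion factor has no effect and the estimate equals the current exposure.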
For example, for certain facility types, some banks assign EADs by using a
previously estimated CCF(f), and then applying the formula:


$$\mathrm{EAD}(f) = \mathrm{CCF}(f) \cdot L(f) \tag{11.15}$$


This method is sometimes called Usage at Default Method.²² If e(f) ≠ 1, this
case can be reduced to the general method, given in (11.13), by assigning a LEQ(f)
factor given by:


$$\mathrm{LEQ}(f) = \frac{\mathrm{CCF}(f) \cdot L(f) - E(f)}{L(f) - E(f)} = \frac{\mathrm{CCF}(f) - e(f)}{1 - e(f)} \tag{11.16}$$


Conversely, if a LEQ(f) is available, from (11.16), an expression for an equivalent
CCF(f) can be found, given by:


$$\mathrm{CCF}(f) = \mathrm{LEQ}(f) \cdot (1 - e(f)) + e(f) \tag{11.17}$$


Therefore, the EAD estimation method based on LEQ(f) and the one based on 
CCF(f) are equivalent, with the exception of those facilities with e(f) = 1. 
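
The equivalence in (11.16) and (11.17) can be verified numerically with the following sketch (function names are hypothetical). Converting a LEQ into a CCF and back returns the original factor, and both routes give the same EAD for a facility with e(f) ≠ 1.

```python
def ccf_to_leq(ccf, usage):
    """(11.16): LEQ(f) = (CCF(f) - e(f)) / (1 - e(f)), defined only for e(f) != 1."""
    if usage == 1.0:
        raise ValueError("LEQ is undefined for a fully drawn facility, e(f) = 1")
    return (ccf - usage) / (1.0 - usage)


def leq_to_ccf(leq, usage):
    """(11.17): CCF(f) = LEQ(f) * (1 - e(f)) + e(f)."""
    return leq * (1.0 - usage) + usage


exposure, limit, leq = 40.0, 100.0, 0.5
usage = exposure / limit
ccf = leq_to_ccf(leq, usage)

ead_via_leq = exposure + leq * (limit - exposure)   # (11.13)
ead_via_ccf = ccf * limit                           # (11.15)
print(ccf, ead_via_leq, ead_via_ccf)
```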

In the following sections, several methods that are normally used in practice 
by banks to estimate LEQ factors are presented from a unified perspective. This 
is used later to analyse the optimality of the different approaches. Additionally, the 
formulae most used in practice are derived as special cases of the previous methods 
when a specific functional form has been assumed for LEQ(f). 


11.5.3 Modelling Conversion Factors from the Reference 
Data Set 


This section presents several methods for estimating conversion factors based on 
regression problems starting with the following basic equation: 


$$\mathrm{EAD}(f) - E(f) = \mathrm{LEQ}(f) \cdot (L(f) - E(f)) \tag{11.18}$$


These methods try to explain the observed increases in the exposure between the 
reference date and the default date and they can be grouped into three approaches 
depending on how these increases are measured: as a percentage of the available 
amount (focus on realised LEQ factors); as a percentage of the observed limit (focus 
on percent increase in usage); or finally in absolute value (focus on increase in 
exposure). 


22 This method is called Momentum Method in CEBS Guidelines (2006, §§ 253 and 254).
Model I. Focus on realised LEQ factors
Dividing (11.18) by L(f) − E(f), the following is obtained:


$$\frac{ead(f) - e(f)}{1 - e(f)} = \frac{\mathrm{EAD}(f) - E(f)}{L(f) - E(f)} = \mathrm{LEQ}(f) \tag{11.19}$$


In this approach, the rationale is to determine a function of the risk drivers, LEQ(RD),
which “explains” the LEQ_i factors associated with RDS(f), LEQ_i = (EAD_i −
E_i)/(L_i − E_i), in terms of LEQ(RD_i). This can be done by starting with an expression
for the error associated with LEQ_i − LEQ(RD_i) and solving a minimisation problem.
In practice, a quadratic and symmetric error function is almost universally
used. As a consequence of this choice, the minimisation problem to solve is given
by (Problem P.I):


$$\operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \big( \mathrm{LEQ}_i - \mathrm{LEQ}(RD_i) \big)^2 \Big\} = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \Big( \frac{\mathrm{EAD}_i - E_i}{L_i - E_i} - \mathrm{LEQ}(RD_i) \Big)^2 \Big\} \tag{11.20}$$


Or:


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \frac{1}{(L_i - E_i)^2} \big( \mathrm{EAD}_i - E_i - \mathrm{LEQ}(RD_i) \cdot (L_i - E_i) \big)^2 \Big\} \tag{11.21}$$


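Model II. Focus on the increase of the exposure as a percentage of the observed
limit (focus on percent increase in usage).
Dividing the basic equation (11.18) by L(f), the following is obtained:


$$\frac{\mathrm{EAD}(f) - E(f)}{L(f)} = \mathrm{LEQ}(f) \cdot \frac{L(f) - E(f)}{L(f)} \tag{11.22}$$


Therefore, using this approach, the observable amounts to be explained are
(EAD_i − E_i)/L_i and the explanatory values are LEQ(RD_i) · (L_i − E_i)/L_i. Following
the same reasoning as in the previous approach, the minimisation problem to solve
is given by (Problem P.II):


$$\operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \Big( \frac{\mathrm{EAD}_i - E_i}{L_i} - \mathrm{LEQ}(RD_i) \cdot \frac{L_i - E_i}{L_i} \Big)^2 \Big\} \tag{11.23}$$


Or:


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \frac{1}{L_i^2} \big( \mathrm{EAD}_i - E_i - \mathrm{LEQ}(RD_i) \cdot (L_i - E_i) \big)^2 \Big\} \tag{11.24}$$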
Model III. Focus on increases in the exposure
Directly from the basic equation, the following is obtained:


$$\mathrm{EAD}(f) - E(f) = \mathrm{LEQ}(f) \cdot (L(f) - E(f)) \tag{11.25}$$


In this case, the amounts to explain are EAD_i − E_i and the explanatory variable is
LEQ(RD_i) · (L_i − E_i). As in the other cases, the associated minimisation problem is
given by (Problem P.III):


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \big( \mathrm{EAD}_i - E_i - \mathrm{LEQ}(RD_i) \cdot (L_i - E_i) \big)^2 \Big\} \tag{11.26}$$


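From (11.21), (11.24) and (11.26), these problems can be reduced to a more
general (Problem P.IV):


$$\operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \Big( \frac{\mathrm{EAD}_i - E_i}{\omega_i} - \mathrm{LEQ}(RD_i) \cdot \frac{L_i - E_i}{\omega_i} \Big)^2 \Big\} \tag{11.27}$$


where ω_i stands for L_i − E_i in Model I, L_i in Model II, and 1 in Model III. If F*
denotes the empirical distribution of (EAD − E)/ω associated with the observations
included in RDS(f), the Problem P.IV can be expressed as:


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ E_{F^*} \Big( \Big( \frac{\mathrm{EAD} - E}{\omega} - \mathrm{LEQ}(RD) \cdot \frac{L - E}{\omega} \Big)^2 \Big) \Big\} \tag{11.28}$$


In the most general case, assuming that (L − E)/ω is constant for observations
in RDS(f), the solution to (11.28) is given by²³:


$$\widehat{\mathrm{LEQ}}(f) = E_{F^*} \Big( \frac{\mathrm{EAD} - E}{\omega} \,\Big|\, RD(f) \Big) \cdot \frac{\omega(f)}{L(f) - E(f)} \tag{11.29}$$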


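As a consequence, the practical problem is to find methods to approximate
these conditional expectations.
If a parametric form for LEQ is assumed, the problem becomes:


$$\widehat{\mathrm{LEQ}}(f) = \mathrm{LEQ}(\hat{a}, \hat{b}, \ldots), \qquad \{\hat{a}, \hat{b}, \ldots\} = \operatorname*{Min}_{\{a, b, \ldots\}} \Big\{ \sum_i \Big( \frac{\mathrm{EAD}_i - E_i}{\omega_i} - \mathrm{LEQ}(a, b, \ldots) \cdot \frac{L_i - E_i}{\omega_i} \Big)^2 \Big\} \tag{11.30}$$


23 See Appendix B.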
If the parametric functional form is linear in the parameters, the problem
becomes a linear regression problem.

In summary, traditional methods can be classified as regression models that
focus on the minimisation of quadratic errors in the forecasts of: LEQ_i factors;
EAD_i as a percentage of the limit; or EAD_i. These methods produce different EAD(f)
estimates based on LEQ(f) estimates proportional to conditional expectations. At
first glance, the approach that focuses directly on LEQ factors (Model I) seems the
most natural, the method that focuses on percent increases in usage (Model II)
seems more stable than the previous one and, as is shown in detail in Sect. 11.6, the
approach based on EAD increases (Model III) could present advantages when the
estimates are used in regulatory capital computations because of the link between
capital requirements and EAD.
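
To make the linear case concrete, the following sketch fits a two-parameter form LEQ(RD_i) = a + b · e_i, where e_i = E_i/L_i is the percent usage, under any of the three weightings of Problem P.IV. The functional form, the function name `fit_linear_leq` and the data layout are assumptions chosen for illustration; the normal equations of the 2 × 2 least squares problem are solved by hand so that no external library is needed.

```python
def fit_linear_leq(observations, model="I"):
    """Least squares fit of LEQ(RD_i) = a + b * e_i under Problem P.IV.

    observations: list of (EAD_i, E_i, L_i) tuples with L_i != E_i.
    model: "I" (omega_i = L_i - E_i), "II" (omega_i = L_i) or "III" (omega_i = 1).
    Returns the pair (a, b).
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for ead, exposure, limit in observations:
        undrawn = limit - exposure
        omega = {"I": undrawn, "II": limit, "III": 1.0}[model]
        usage = exposure / limit
        y = (ead - exposure) / omega      # amount to be explained
        x1 = undrawn / omega              # regressor multiplying a
        x2 = usage * undrawn / omega      # regressor multiplying b
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * y
        t2 += x2 * y
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("usage levels do not vary enough to identify a and b")
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b
```

When the data are generated exactly by a linear rule, the three models recover the same parameters; with noisy data they generally differ, because each choice of ω_i weights the observations differently.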


11.5.4 LEQ = Constant 


11.5.4.1 Problem P.I: The Sample Mean 


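In practice,²⁴ banks frequently use, as an estimator for LEQ(f) at t, the sample mean
of realised LEQ_i, restricted to those observations i = {g, t} similar to {f, t, RD}.
Assuming that the conversion factor is a constant for observations similar to {f, t},
LEQ(f) = LEQ, and solving the Problem P.I the following is obtained:


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \Big( \frac{\mathrm{EAD}_i - E_i}{L_i - E_i} - \mathrm{LEQ} \Big)^2 \Big\} = \frac{1}{n} \sum_i \frac{\mathrm{EAD}_i - E_i}{L_i - E_i} = \frac{1}{n} \sum_i \mathrm{LEQ}_i \tag{11.31}$$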


In other cases, banks use a sample weighted mean that tries to account for a
possible relationship between size of the exposures (or limits) and LEQ. If in
Problem P.I a weight w_i is introduced, and it is assumed that LEQ is constant for
observations similar to {f, t}, then:


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i w_i \Big( \frac{\mathrm{EAD}_i - E_i}{L_i - E_i} - \mathrm{LEQ} \Big)^2 \Big\} = \frac{\sum_i w_i \cdot \mathrm{LEQ}_i}{\sum_i w_i} \tag{11.32}$$


When the reason for incorporating the weighting is to take into account a LEQ 
risk driver, this approach is inconsistent. The reason for this is that the weighted 
average is the optimum solution only after assuming that LEQ = constant, i.e. no 
risk drivers are considered. 


24 At least this is the case in models applied by some Spanish banks at present (2006).


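11.5.4.2 Problem P.II: The Regression Without Constant


Another method widely used by banks is to use the regression estimator for
the slope of the regression line based on Model II, assuming that LEQ is a constant.
Under these conditions the expression for the regression estimator is given by:


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \Big( \frac{\mathrm{EAD}_i - E_i}{L_i} - \mathrm{LEQ} \cdot \frac{L_i - E_i}{L_i} \Big)^2 \Big\} = \frac{\sum_i \dfrac{(\mathrm{EAD}_i - E_i)(L_i - E_i)}{L_i^2}}{\sum_i \dfrac{(L_i - E_i)^2}{L_i^2}} = \frac{\sum_i (ead_i - e_i)(1 - e_i)}{\sum_i (1 - e_i)^2} \tag{11.33}$$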


11.5.4.3 Problem P.III: Sample Weighted Mean 


If in P.III it is assumed that LEQ = constant it can be expressed as:


$$\widehat{\mathrm{LEQ}}(f) = \operatorname*{Min}_{\mathrm{LEQ}} \Big\{ \sum_i \big( \mathrm{EAD}_i - E_i - \mathrm{LEQ} \cdot (L_i - E_i) \big)^2 \Big\} \tag{11.34}$$


And the optimum is given by:


$$\widehat{\mathrm{LEQ}} = \frac{\sum_i w_i \cdot \mathrm{LEQ}_i}{\sum_i w_i}, \quad \text{with } w_i = (L_i - E_i)^2 \tag{11.35}$$


Therefore, using this approach, a weighted mean naturally arises. However, it
is worth noting that these weights, (L_i − E_i)², are different from those currently
proposed by some banks (based on L_i or E_i).
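
The three constant-LEQ estimators of this section can be computed side by side with the sketch below. The function names and the two invented observations are assumptions for illustration only; each observation is a triple (EAD_i, E_i, L_i) with L_i ≠ E_i.

```python
def leq_sample_mean(obs):
    """Problem P.I, (11.31): plain average of the realised LEQ_i."""
    leqs = [(ead - exposure) / (limit - exposure) for ead, exposure, limit in obs]
    return sum(leqs) / len(leqs)


def leq_regression_no_constant(obs):
    """Problem P.II, (11.33): regression through the origin in percent-of-limit terms."""
    num = sum(((ead - exposure) / limit) * ((limit - exposure) / limit)
              for ead, exposure, limit in obs)
    den = sum(((limit - exposure) / limit) ** 2 for ead, exposure, limit in obs)
    return num / den


def leq_weighted_mean(obs):
    """Problem P.III, (11.35): mean of LEQ_i weighted by (L_i - E_i)^2."""
    num = sum((ead - exposure) * (limit - exposure) for ead, exposure, limit in obs)
    den = sum((limit - exposure) ** 2 for ead, exposure, limit in obs)
    return num / den


observations = [(90.0, 50.0, 100.0), (60.0, 40.0, 140.0)]   # realised LEQ_i: 0.8 and 0.2
print(leq_sample_mean(observations))
print(leq_regression_no_constant(observations))
print(leq_weighted_mean(observations))
```

In `leq_weighted_mean` the numerator is written as the sum of (EAD_i − E_i)(L_i − E_i), which equals the sum of w_i · LEQ_i with w_i = (L_i − E_i)², so the code matches (11.35) without dividing and multiplying by the same quantity. On the example data the three estimators give clearly different values, because the second observation has a much larger undrawn amount.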


11.5.5 Usage at Default Method with CCF = Constant 
(Simplified Momentum Method) 


This method is sometimes used by banks that try to avoid the explicit use of realised 
negative LEQ factors, or for facilities for which the current usage has no predictive 
power on EADs. It estimates the EAD for a non-defaulted facility, EAD(f), by using 
(11.15) directly and a rough CCF estimate, for example, the sample mean of the 
realised CCFs computed from a set of defaulted facilities C. 


$$\mathrm{EAD}(f) = \mathrm{CCF}(C) \cdot L(f) \tag{11.36}$$
From (11.16) and assuming that CCF = constant, a specific functional form for
LEQ(e(f)) is found, given by:


$$\widehat{\mathrm{LEQ}}(f) = \frac{\mathrm{CCF} \cdot L(f) - E(f)}{L(f) - E(f)} = \frac{\mathrm{CCF} - e(f)}{1 - e(f)} \tag{11.37}$$


In general, two facilities with the same estimated CCF and with different values
for current percent usage, e(t), will have different LEQ estimates following
(11.37).

The main drawback with the procedure based on (11.36) is that experience
shows that, in general, drawn and undrawn limits have strong explanatory power
for the EAD. For this reason, this method (with CCF = constant) does not seem
to meet the requirement of using all the relevant information²⁵ (because it does
not take into account the drawn and undrawn amounts as explanatory variables
in the EAD estimating procedure) for most of the types of facilities that arise in
practice.
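
The drawback can be seen in a small numerical sketch of (11.36) and (11.37). The constant CCF of 0.7 and the two facilities are invented, and the function names are hypothetical.

```python
def momentum_ead(ccf, limit):
    """(11.36): EAD(f) = CCF * L(f), ignoring the current exposure."""
    return ccf * limit


def implied_leq(ccf, exposure, limit):
    """(11.37): LEQ implied by a constant CCF for a facility with L(f) != E(f)."""
    return (ccf * limit - exposure) / (limit - exposure)


ccf = 0.7   # assumed class-level CCF, e.g. a sample mean of realised CCFs
for exposure, limit in [(20.0, 100.0), (90.0, 100.0)]:
    print(exposure, momentum_ead(ccf, limit), implied_leq(ccf, exposure, limit))
```

Both facilities receive the same EAD estimate because they share the limit, although one of them is already drawn beyond that estimate. For that facility the implied LEQ is negative, which shows how ignoring the drawn amount can produce estimates below the current exposure.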


11.6 How to Assess the Optimality of the Estimates 


To assess the optimality of the different CF estimates associated with a reference 
data set and a portfolio, it is necessary to be more precise about some elements in 
the basic problem. The first element requiring clarification is the type of estimates 
according to the role of macroeconomic risk drivers in the estimation method. The 
second element is how to measure the errors associated with the estimates and to 
motivate that particular choice. This can be done by introducing a loss function that 
specifies how the differences between the estimated values for the EAD and the 
actual values are penalised. 


11.6.1 Type of Estimates 


Focusing on the use of the macroeconomic risk drivers, the following types of 
estimates can be distinguished: 


• Point in Time estimates (PIT): these estimates are conditional on certain
values of the macroeconomic risk drivers, for example, values close to the
current ones. This allows the estimates to be affected by current economic
conditions and to vary over the economic cycle. In theory, this is a good
property for the internal estimates banks need for pricing and other management
purposes. The main problem with PIT estimates is that they are based
on less data than long-run estimates (LR estimates, defined below) and
therefore, in practice, they are less stable than LR estimates and harder to
estimate.


25 § 476. “The criteria by which estimates of EAD are derived must be plausible and intuitive, and
represent what the bank believes to be the material drivers of EAD. The choices must be supported
by credible internal analysis by the bank. [...] A bank must use all relevant and material
information in its derivation of EAD estimates. [...]”, BCBS (2004).

• Long-run estimates (LR): these are unconditional macroeconomic estimates, i.e.
the macroeconomic risk drivers are ignored. The main advantage is that they are
more robust and stable than PIT estimates. These LR estimates are required in
AIRB approaches,²⁶ except for those portfolios in which there is evidence of
negative dependence between default rates and LEQ factors. Currently, these LR
estimates are also used by banks for internal purposes.

• Downturn estimates (DT): these are specific PIT estimates based on macroeconomic
scenarios (downturn conditions) in which the default rates for the
portfolio are deemed to be especially high. When there is evidence of the
existence of adverse dependencies between default rates and conversion factors,
this could be the type of estimates that, in theory, should be used in IRB
approaches.²⁷ In practice, the use of DT estimates is difficult because, in
addition to the difficulties associated with PIT estimates, it is necessary to
identify downturn conditions and to have sufficient observations in the RDS
restricted to these scenarios.


In the following, it is assumed that the focus is on long run estimates. 
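
The distinction between the three types of estimates can be sketched on a toy data set. The sketch below is only illustrative: the regime labels, the LEQ values and the use of a plain sample mean as the estimator are assumptions made here, not part of the chapter's data or method.

```python
from statistics import mean

# Hypothetical realised LEQ factors, each tagged with a macroeconomic regime label.
observations = [
    ("normal", 0.30), ("normal", 0.40), ("normal", 0.35),
    ("downturn", 0.70), ("downturn", 0.80),
]

def long_run_estimate(obs):
    """LR: ignore the macroeconomic driver and average over all observations."""
    return mean(leq for _, leq in obs)

def conditional_estimate(obs, regime):
    """PIT (or DT when regime == 'downturn'): use only matching observations."""
    subset = [leq for label, leq in obs if label == regime]
    return mean(subset)

lr = long_run_estimate(observations)
pit_normal = conditional_estimate(observations, "normal")
dt = conditional_estimate(observations, "downturn")
```

The conditional estimates rest on three and two observations respectively, against five for the LR estimate, which is the data scarcity the text attributes to PIT and DT estimates.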


11.6.2 A Suitable Class of Loss Functions 


The objective of this section is to determine a type of loss function that meets the 
basic requirements for the EAD estimation problem when it is necessary to obtain 
EAD estimates adequate for IRB approaches. Therefore, it makes sense to specify 
the loss associated with the difference between the estimated value and the real one 
in terms of the error in the minimum regulatory capital (computed as the difference 
between the capital requirements under both values). By using the regulatory 
formula, at the level of the facility, the loss associated with the difference between 
the capital requirement under the estimated value of the exposure K(EAD(f)) and 
the real one K(EAD), could be expressed as follows28: 

$$L(\Delta K(f)) = L(K(EAD) - K(EAD(f))) = L(\phi(PD) \cdot LGD \cdot (EAD - EAD(f))) = L(\phi(PD) \cdot LGD \cdot \Delta(EAD(f))) \tag{11.38}$$


26 § 475. “Advanced approach banks must assign an estimate of EAD for each facility. It must be an 
estimate of the long-run default-weighted average EAD for similar facilities and borrowers over a 
sufficiently long period of time, [...] If a positive correlation can reasonably be expected between 
the default frequency and the magnitude of EAD, the EAD estimate must incorporate a larger 
margin of conservatism. Moreover, for exposures for which EAD estimates are volatile over the 
economic cycle, the bank must use EAD estimates that are appropriate for an economic downturn, 
if these are more conservative than the long-run average.”, BCBS (2004). 

27 This can be interpreted in the light of the clarification of the requirements on LGD estimates in 
Paragraph 468 of the Revised Framework, BCBS (2005a, b). 


Additionally, at least from a regulatory point of view, underestimating the 
capital requirement creates more problems than overestimating such a figure. For 
this reason, it is appropriate to use asymmetric loss functions that penalise 
an underestimation of the capital requirement more than an overestimation of the same 
amount. The simplest family of such functions is given by (11.39), where b > a: 

$$L(\Delta K) = \begin{cases} a \cdot \Delta K & \text{iff } \Delta K \ge 0 \\ b \cdot |\Delta K| & \text{iff } \Delta K < 0 \end{cases} \tag{11.39}$$

These loss functions quantify the level of conservatism. The larger b/a (relative 
loss associated with an underestimation of K), the larger is the level of conservatism 
imposed. For example, if a = 1 and b = 2, the loss associated with an underestimation 
of the capital requirement (ΔK < 0) is twice the loss for an overestimation of the 
same amount.29 The graph of the loss function is presented in Fig. 11.6. 


Fig. 11.6 Linear asymmetric loss function (branches L = 2 · ΔK and L = ΔK; horizontal axis from −ΔK to ΔK, in €) 
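
A minimal sketch of a loss function of this family follows. To sidestep sign conventions, it is written directly in terms of under- and overestimation: the weight b is applied when the estimate falls short of the realised value and the weight a otherwise. The function name and argument names are illustrative assumptions.

```python
def asymmetric_loss(actual, estimate, a=1.0, b=2.0):
    """Linear asymmetric loss: weight b for underestimation, a for overestimation."""
    error = actual - estimate
    if error > 0:           # estimate below the realised value: underestimation
        return b * error
    return a * (-error)     # estimate at or above the realised value

under = asymmetric_loss(actual=100.0, estimate=90.0)    # shortfall of 10, weight b
over = asymmetric_loss(actual=100.0, estimate=110.0)    # excess of 10, weight a
```

With a = 1 and b = 2, an underestimation of 10 costs twice as much as an overestimation of 10.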


28 In the following it is assumed that a PD = PD(f) and an LGD = LGD(f) have been estimated 
previously. 

29 To the best of my knowledge, the first application of such a loss function in the credit risk context 
was proposed in Moral (1996). In that paper the loss function is used to determine the optimal level 
of provisioning as a quantile of the portfolio loss distribution. 

By using this specific type of loss function (11.39), and assuming that LGD > 0, 
a simpler expression for the error in K in terms of the error in EAD is obtained: 

$$L(\Delta K(f)) = \phi(PD) \cdot LGD \cdot L(\Delta(EAD(f))) \tag{11.40}$$

The loss associated with an error in the capital requirement is proportional to the 
loss associated with the error in terms of exposure and the units of the loss are the 
same as those of the exposure (€). 
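
The proportionality rests on the fact that a piecewise-linear loss of the form (11.39) is positively homogeneous. A short check, writing c = φ(PD) · LGD as shorthand (c is notation introduced here only for this step) and assuming c ≥ 0:

$$\begin{aligned}
x \ge 0: &\quad L(c \cdot x) = a \cdot c \cdot x = c \cdot L(x) \\
x < 0: &\quad L(c \cdot x) = b \cdot |c \cdot x| = c \cdot b \cdot |x| = c \cdot L(x)
\end{aligned}$$

Setting x = Δ(EAD(f)) in (11.38) then gives (11.40). A quadratic loss would scale with c² instead, so the same step would not yield a loss expressed in the units of the exposure.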


11.6.3 The Objective Function 


Once the loss function has been determined, it is necessary to find the most natural 
objective function for the estimation problem. 


11.6.3.1 Minimization at Facility Level of the Expectation in the Capital 
Requirement Error 


If the expected error in the minimum capital requirement at the level of exposure 
is used as an objective function, by using (11.40) the following is obtained: 


$$\operatorname{Min}\{E[L(\Delta K(f))]\} = \phi(PD) \cdot LGD \cdot \operatorname{Min}\{E[L(\Delta(EAD(f)))]\} \tag{11.41}$$


This means that Problem P.III in Sect. 11.5.3 arises with a different loss 
function: 

$$\operatorname{Min}_{LEQ} \left\{ E\left[ L\left(EAD - E - LEQ(RD) \cdot (L - E)\right) \right] \right\} \tag{11.42}$$

or in terms of the sample 

$$\operatorname{Min}_{LEQ} \left\{ \sum_i L\left(EAD_i - E_i - LEQ(RD_i) \cdot (L_i - E_i)\right) \right\} \tag{11.43}$$

and a solution is given30 by: 

$$LEQ(f) = \frac{1}{L(f) - E(f)} \cdot Q\left(EAD_i - E_i, \frac{b}{a+b}\right) \tag{11.44}$$

where Q(x, b/(a + b)) stands for a quantile of the distribution F(x) such that31 
F(Q) = b/(a + b). When a = b, the loss function (11.39) is symmetric and the 
former quantile is the median; for values of b/a > 1 the associated quantile is 
placed to the right of the median and, therefore, a more conservative estimate of 
LEQ(f) is obtained. It is interesting to note that (11.44), with b > a, penalises 
uncertainty.32 

30 See Appendix B. 
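
The quantile solution can be checked numerically on a small sample. The sketch below is illustrative only: the data are hypothetical, all facilities are assumed to share the same undrawn amount, the loss applies the weight b to underestimation, and the empirical quantile is taken as the smallest sample value with at least a fraction q of the sample at or below it (one of several admissible definitions).

```python
import math

def asymmetric_loss(actual, estimate, a, b):
    error = actual - estimate
    return b * error if error > 0 else a * (-error)

def empirical_quantile(values, q):
    """Smallest sample value x with at least a fraction q of the sample <= x."""
    ordered = sorted(values)
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]

def leq_from_quantile(increases, undrawn, a, b):
    """In the spirit of (11.44): quantile of EAD_i - E_i divided by L(f) - E(f)."""
    return empirical_quantile(increases, b / (a + b)) / undrawn

# Hypothetical increases EAD_i - E_i (in EUR) for facilities similar to f,
# all with the same undrawn amount L(f) - E(f) = 100.
increases = [-20.0, 10.0, 30.0, 40.0, 55.0, 60.0, 70.0, 80.0, 95.0]
a, b = 1.0, 2.0
leq = leq_from_quantile(increases, undrawn=100.0, a=a, b=b)

def total_loss(d):
    """Sample loss when d is used as the estimate of every increase."""
    return sum(asymmetric_loss(x, d, a, b) for x in increases)

# Brute-force check: no candidate on a fine grid achieves a lower total loss.
best_on_grid = min(total_loss(g / 10) for g in range(-500, 1500))
```

Here the quantile of order 2/3 is 60, so LEQ(f) = 0.6, whereas the sample median (55) would give 0.55: a larger b/a moves the estimate to the right.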

An important consequence of using the former loss function L is that the 
problems M.I and M.II described in (11.45) and (11.46) are equivalent.33 

Problem M.I: 

$$\operatorname{Min}_{LEQ} \left\{ \sum_i L\left(EAD_i - E_i - LEQ(RD_i) \cdot (L_i - E_i)\right) \right\} \tag{11.45}$$

Subject to: 0 ≤ LEQ(RD) ≤ 1 

Problem M.II: 

$$\operatorname{Min}_{LEQ} \left\{ \sum_i L\left(\operatorname{Min}[\operatorname{Max}[EAD_i, E_i], L_i] - E_i - LEQ(RD_i) \cdot (L_i - E_i)\right) \right\} \tag{11.46}$$

Subject to: 0 ≤ LEQ(RD) ≤ 1 


This means that an estimator meeting the constraint 0 ≤ LEQ(f) ≤ 1 that is 
optimal when using the original data is also optimal when using data censored to 
show realised LEQ factors between zero and one. 
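
The equivalence can be illustrated with a brute-force search over a constant LEQ. The following is a sketch under stated assumptions: the six observations are hypothetical, the loss applies the weight b to underestimation, and the search is restricted to a grid on [0, 1].

```python
def asymmetric_loss(error, a=1.0, b=2.0):
    """error = realised value minus estimate; b weights underestimation."""
    return b * error if error > 0 else a * (-error)

# Hypothetical observations (E_i, L_i, EAD_i); some EAD_i fall outside [E_i, L_i].
obs = [(50.0, 100.0, 130.0), (20.0, 100.0, 10.0), (60.0, 100.0, 90.0),
       (10.0, 100.0, 70.0), (80.0, 100.0, 85.0), (40.0, 100.0, 100.0)]

def total_loss(leq, censored):
    total = 0.0
    for e, l, ead in obs:
        if censored:
            ead = min(max(ead, e), l)    # censoring used in Problem M.II
        total += asymmetric_loss(ead - e - leq * (l - e))
    return total

grid = [k / 1000 for k in range(0, 1001)]    # candidate constant LEQ in [0, 1]
best_original = min(grid, key=lambda q: total_loss(q, censored=False))
best_censored = min(grid, key=lambda q: total_loss(q, censored=True))
gap = total_loss(0.3, censored=False) - total_loss(0.3, censored=True)
```

Both searches return the same LEQ, and the two objective functions differ only by an amount that does not depend on the candidate LEQ, which is the content of the proposition in Appendix A.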


11.6.3.2 Minimization of the Error in the Capital Requirement at Facility 
Level for Regulatory Classes 


Sometimes, in spite of the existence of internal estimates for LEQ factors at 
facility level, it could be necessary to associate a common LEQ with all the 
facilities included in a class comprising facilities with different values for the 
internal risk drivers. This could occur due to difficulties in demonstrating, with 
the available data, that discrimination at an internal level of granularity is justified. 
In this case, for regulatory use, it is necessary to adopt a less granular 
structure for the risk drivers than the existing internal one. Therefore, the problem 
of finding an optimal estimator for regulatory use can be solved by using the 
regulatory structure for the risk drivers. In other words, the procedure is to 
compute new estimates using the same method and a less granular risk driver 
structure. In general, the new estimator is not a simple or weighted average of the 
former more granular estimates. 


31 In practice, it is necessary to be more precise when defining a q-quantile because the distribution 
F(x) is discrete. A common definition is: a “q-quantile” of F(x) is a real number, Q(x,q), that 
satisfies P[X ≤ Q(x,q)] ≥ q and P[X ≥ Q(x,q)] ≥ 1 − q. In general, with this definition there may be 
more than one q-quantile. 

32 § 475. “Advanced approach banks must assign an estimate of EAD for each facility. It must be an 
estimate [...] with a margin of conservatism appropriate to the likely range of errors in the 
estimate.”, BCBS (2004). 

33 The proof follows from the proposition in Appendix A. 
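
That re-estimating on the coarser structure differs from averaging the granular estimates can be seen on a toy example. In this sketch the two internal classes, their realised LEQ factors and the quantile convention are assumptions chosen only for illustration.

```python
import math

def empirical_quantile(values, q):
    """Smallest sample value x with at least a fraction q of the sample <= x."""
    ordered = sorted(values)
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]

# Hypothetical realised LEQ factors for two granular internal classes.
class_a = [0.10, 0.20, 0.30, 0.40]
class_b = [0.80, 0.90]
q = 0.5

granular = {"A": empirical_quantile(class_a, q),
            "B": empirical_quantile(class_b, q)}
weighted_average = (len(class_a) * granular["A"] + len(class_b) * granular["B"]) / (
    len(class_a) + len(class_b))

# Regulatory class = union of both classes: re-estimate on the pooled data.
pooled = empirical_quantile(class_a + class_b, q)
```

The pooled quantile is 0.30 while the observation-weighted average of the class quantiles is 0.40, so the regulatory estimate has to be recomputed rather than aggregated.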


11.7 Example 1 


This example34 illustrates the pros and cons of using the methods explained in the 
previous sections for estimating LEQ factors and EADs. The focus is on long run 
estimates for the EAD of a facility f in normal status by using as basic risk drivers 
the current limit L(f) and exposure E(f). 


11.7.1 RDS 


11.7.1.1 Characteristics 


The main characteristics of the reference data set, used in this example, are 
described below: 


• Source of the RDS: the observations were obtained from a set of defaulted 
facilities from a portfolio of SMEs 

• Observation period: 5 years 

• Product types: credit lines with a committed credit limit that is known to the 
borrower, given by L(t) 

• Exclusions: the RDS does not include all the internal defaults which took place during 
the observation period because several filters had been applied previously. As a 
minimum, the following facilities were excluded from the data set: 

– defaulted facilities with L(td−12) < E(td−12) and 
– those with less than 12 monthly observations before the default date 

• Number of observations, O_i: #RDS = 417 · 12 = 5,004 observations, which are 
associated with 417 defaulted facilities and dates 1, 2,...,12 months before the 
default date 

• Structure of the reference data set: the structure proposed in (11.8) but, for 
simplicity, only a basic set of risk drivers is considered: 

$$O_i = \{i, (f, tr), RD_i = \{L(tr), E(tr), S(tr)\}, EAD_i, td, tr\} \tag{11.47}$$

• Status of a facility at the reference date, S(tr): there is no information about the 
status of the facilities. The bank has implemented a warning system that classifies 
the exposures into four broad classes: N = normal monitoring and controls; 
V = under close monitoring for drawdowns; J = current exposure greater than 
the limit, which implies tight controls making additional drawdowns impossible 
without prior approval; D = defaulted, no additional drawdowns are 
possible, but sometimes there are increases in the exposures due to the payment 
of interest and fees. However, in this example, in order to take into account the 
status, S(tr), as a risk driver, observations with S(tr) = N are approximated using 
the following procedure: 

– First, all the observations with L(tr) < E(tr) are marked as in a non-normal 
status 

– Second, after analysing the empirical distributions of realised LEQ factors 
(and other information) it was decided to consider all the observations with td 
− tr less than 5 months as if they were in a non-normal status and to eliminate 
all the observations with td − tr = 7 months (see next section). 


34 Although this example could be representative for certain SME portfolios comprising credit 
lines, it is not a portfolio taken from a bank. 


In practice, the use of the values of the variable status is necessary, because 
the early identification of problematic borrowers and the subsequent changes in the 
availability of access to the nominal limit have important consequences for the 
observed EADs. For this reason, observations dated 5 or more months before default for 
which E(tr) < L(tr) are considered in normal status. In this case, the number of 
observations with S(tr) = N is: #RDS(N) = 2,919. 
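
The observation structure and the status approximation described above can be sketched as follows. The class and field names are illustrative assumptions, the sample records are hypothetical, and treating the tie L(tr) = E(tr) as normal is an arbitrary choice made here because the text does not settle that case.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """One RDS record, loosely following (11.47); field names are illustrative."""
    facility: str
    months_to_default: int    # td - tr
    limit: float              # L(tr)
    exposure: float           # E(tr)
    ead: float                # realised EAD_i = E(td)

def approximate_status(obs):
    """Return 'N', 'non-normal', or None when the observation is discarded."""
    if obs.months_to_default == 7:
        return None                        # horizon removed from the RDS
    if obs.limit < obs.exposure or obs.months_to_default < 5:
        return "non-normal"
    return "N"

sample = [
    Observation("f1", 12, 100.0, 60.0, 95.0),
    Observation("f1", 7, 100.0, 70.0, 95.0),
    Observation("f1", 3, 100.0, 90.0, 95.0),
    Observation("f2", 9, 100.0, 120.0, 125.0),
]
statuses = [approximate_status(o) for o in sample]
```

Only the first record is kept as an observation in normal status; the others are discarded (horizon 7) or marked as non-normal (close to default, or exposure above the limit).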


11.7.1.2 Empirical Distributions of Certain Statistics 
Distribution of Realised LEQ Factors 


Figure 11.7 summarises the empirical variability of the realised LEQ factors 
associated with 2,442 observations for which it is possible to compute this 
statistic.35 

It shows that the distribution is asymmetric with a high number of observations 
outside of [0,1], which is the natural range for LEQ factors. The sample mean is 
about −525 due to the existence of many observations with large negative values, 
which highlights one of the main issues when a sample mean is used as the 
estimator. The median is 0.97 and this value, in contrast with the former sample 
mean value, highlights the advantages of using statistics less dependent on 
the extreme values of the distribution for estimation purposes. 

Fig. 11.7 Histogram of realised LEQ factors 

35 Observations associated with the horizon value td − tr = 7 were removed from the RDS, as 
explained later on. 
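
The sensitivity of the sample mean to extreme realised LEQ factors can be reproduced on a handful of hypothetical observations. The sketch assumes the realised factor is computed as the increase in exposure divided by the undrawn amount at the reference date, and that it is left undefined when the undrawn amount is zero.

```python
from statistics import mean, median

def realised_leq(ead, exposure, limit):
    """(EAD_i - E(tr)) / (L(tr) - E(tr)); None when the undrawn amount is zero."""
    undrawn = limit - exposure
    if undrawn == 0:
        return None
    return (ead - exposure) / undrawn

# Hypothetical (EAD_i, E(tr), L(tr)) triples; the last has a tiny undrawn amount.
triples = [(95.0, 60.0, 100.0), (100.0, 50.0, 100.0), (80.0, 40.0, 100.0),
           (49.0, 99.9, 100.0)]
factors = [realised_leq(*t) for t in triples]
factors = [x for x in factors if x is not None]

sample_mean = mean(factors)
sample_median = median(factors)
```

A single observation with an almost fully drawn limit yields a factor of about −509 and drags the mean far below zero, while the median stays inside [0, 1].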


Joint Distribution of Realised LEQ Factors and Percent Usage 
at the Reference Date 


To reduce the variability in the observed realised LEQ factors, it is necessary to 
consider a variable that exhibits explanatory power, at least, for the range of 
values of realised LEQ factors. For example, the joint empirical distribution 
presented in Fig. 11.8 shows that the variable percent usage at the reference 
date is important for limiting the variability of realised LEQ factors. Black 
points at the top of Fig. 11.8 represent the observations in the space {1 − e(tr), 
LEQ_i}. 


Influence of td — tr in the Basic Statistics 


Figure 11.9 presents the empirical distributions of realised LEQs associated with 
a fixed distance in months between the default and reference dates for td − 
tr = 1,...,12. 

Fig. 11.8 Joint distribution of LEQ_i and percent usage at the reference date tr 

Fig. 11.9 Empirical distributions of LEQ_i conditional on different td − tr values 

Fig. 11.10 Empirical distributions of percent increase in usage since the reference date 

Fig. 11.11 Empirical distributions of the increase in exposure from tr to td 


The distributions associated with td — tr = 1, 2, 3, 4 are very different from 
the others. The distribution conditional on td — tr = 7 months is totally anomalous 
and the reason for that is an error in the processes that generated these data. 

Figure 11.10 presents the empirical distributions of the percent increase in usage 
between the reference and the default dates, ead_i − e(tr), associated with a fixed 
distance in months between the default and reference dates for td − tr = 1,...,12. 
Again, the differences among the distributions conditional on reference dates near 
to default and far from default are obvious and the existence of anomalous values 
for the case td − tr = 7 is evident. 

Finally, Fig. 11.11 shows the empirical distributions of the increase in exposure, 
EAD_i − E(tr), between the reference and the default dates. 


11.7.2 Estimation Procedures 


11.7.2.1 Model II 

Original Data and Fixed Time Horizon 

Some banks use Model II assuming a constant LEQ, and a fixed time horizon 
approach, T = 12 months. This means that they fit a linear regression model 
without an independent term, given by: 

$$\frac{EAD_i}{L(td-12)} - \frac{E(td-12)}{L(td-12)} = LEQ \cdot \left(1 - \frac{E(td-12)}{L(td-12)}\right) \tag{11.48}$$


Therefore, in these cases, the bank’s approach focuses on the minimisation of the 
quadratic error in the increase of the exposure expressed in percentage terms of the 
limit. The results with this method are summarised below: 

By using the original data, the estimated LEQ factor is LEQ = 0.637 and the 
adjusted R² is 0.13. Therefore, the final estimate for the EAD of a facility, f, in 
normal status is given by the formula: 

$$EAD(f) = E(t) + 0.637 \cdot (L(t) - E(t)) \tag{11.49}$$

Figure 11.12 presents, for each observation in the RDS(td−12), the values of 
the pairs {1 − e(td−12), ead_i − e(td−12)}. The upper shaded zones in 
Figs. 11.12–11.14 are associated with points with LEQ_i > 1. 

Fig. 11.12 Percent increase in usage from tr to td and percent usage at the reference date (axes: ead_i − e(tr) against 1 − e(tr)) 

Fig. 11.13 Linear regression in Model II and censored data 

Fig. 11.14 Linear regression in Model II and variable time approach 

From analysis of the distribution of these points and the results of the regression 
it is clear that, at least: 

1. It is necessary to carry out an explicit RDS cleaning process before the estimation 
phase. For example, it is necessary to analyse the observations associated 
with the points above the line y = x and afterwards to make decisions about 
which observations have to be removed from the RDS. 

2. The goodness of fit is very low. Most of the points (those with 1 − e(tr) 
closer to zero) have little influence on the result of the regression model because 
of the constraint that there is no independent term. 

3. In order to assess the reliability of the estimated LEQ it is necessary to identify 
outliers and influential observations and to perform stability tests. In this case, 
given the functional form of the model, y = k · x, and the low number of points 
associated with large values of 1 − e(tr), these observations are influential 
points.36 It is easy to understand that changes in these points affect the result of 
the regression and therefore the LEQ estimate. 

4. In order to get more stable results, it is necessary to get more observations (for 
example by using a variable time horizon approach). 
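
The role of influential points in a regression without an independent term can be sketched with the closed-form least-squares slope k = Σ x·y / Σ x². The data below are hypothetical pairs (1 − e(tr), ead_i − e(tr)), chosen so that most points lie near x = 0 and a single point has a large undrawn percentage.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of y = k * x with no independent term."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Hypothetical pairs (1 - e(tr), ead_i - e(tr)).
xs = [0.02, 0.05, 0.05, 0.10, 0.10, 0.90]
ys = [0.02, 0.00, 0.05, 0.02, 0.08, 0.60]

leq_all = slope_through_origin(xs, ys)
leq_without_last = slope_through_origin(xs[:-1], ys[:-1])
```

Dropping the single point with x = 0.90 moves the estimated LEQ from about 0.66 to about 0.51, because each point enters the slope with weight x².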


Censored Data and Fixed Time Horizon 


Sometimes banks use censored data to force realised LEQ factors to satisfy the 
constraint 0 ≤ LEQ_i ≤ 1. Using censored data, the estimated LEQ factor is 0.7 and 
the R² increases to 0.75. In this case, all the points are in the white triangular region 
of Fig. 11.13 and it is clear that the existence of very influential points (those with 
large values of 1 − e(tr)) introduces instability. Figure 11.13 presents the censored 
observations and the regression line. 

The EAD estimator is in this case: 

$$EAD(f) = E(t) + 0.7 \cdot (L(t) - E(t)) \tag{11.50}$$


Original Data and Variable Time Approach 


By using a variable time approach, based on observations with tr = td − {12, 11, 
10, 9, 8}, the estimated LEQ factor is LEQ = 0.49 and the R² is 0.06. Figure 11.14 
presents, for each observation in the RDS, the pairs {1 − e(tr), ead_i − e(tr)} and the 
regression line associated with this extended data set and Model II. 

In Model II, the use of a variable time approach does not improve the goodness of 
fit (which is very low due to the functional form assumed in the model), 
but increases the stability of the results. 

The EAD estimator in this case is: 

$$EAD(f) = E(t) + 0.49 \cdot (L(t) - E(t)) \tag{11.51}$$
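
As a worked illustration, the three Model II estimators (11.49) to (11.51) can be applied to the same facility. The LEQ values are those reported in the text; the facility (60,000 € drawn against a limit of 100,000 €) is a hypothetical example.

```python
def ead_estimate(exposure, limit, leq):
    """EAD(f) = E(t) + LEQ * (L(t) - E(t)), as in (11.49)-(11.51)."""
    return exposure + leq * (limit - exposure)

# LEQ estimates reported in the text for the three Model II variants.
leq_by_variant = {"original, fixed horizon": 0.637,
                  "censored, fixed horizon": 0.7,
                  "original, variable time": 0.49}

# Hypothetical facility: EUR 60,000 drawn against a EUR 100,000 limit.
estimates = {name: ead_estimate(60_000.0, 100_000.0, leq)
             for name, leq in leq_by_variant.items()}
```

The resulting EAD estimates are roughly 85,480 €, 88,000 € and 79,600 €, a spread of more than 8,000 € that comes only from the choice of data treatment.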


11.7.2.2 The Sample Mean and the Conditional Sample Mean 


If Model I is used and a constant LEQ for facilities “similar” to f is assumed, an 
estimate for EAD(f) is obtained by computing the sample mean of the realised LEQ 
conditional on observations in RDS(f) as the LEQ(f) estimate and then applying 
(11.13). With regard to RDS(f), in this example, two possibilities are analysed: 

• RDS(f) = RDS or, equivalently, the use of a global sample mean as estimator. 
• RDS(f) = {O_i such that percent usage e_i is similar to e(f)} or, equivalently, the use 
as estimator of a function based on different local means depending on e(f). 


36 Influential points have a significant impact on the slope of the regression line which, in Model II, 
is precisely the LEQ estimate. 


Case RDS(f) = RDS, Use of a Global Sample Mean 


If the sample mean of all the realised LEQ factors associated with the observations 
in the RDS is computed, the result is a nonsensical figure: 


$$LEQ(f) = \overline{LEQ} = \frac{1}{n} \sum_i LEQ_i = -578 \tag{11.52}$$


The problems that arise when using this global average are due to: 

1. Instability of certain realised LEQ factors: when 1 − E(f)/L(f) is small the 
realised LEQs are not informative. 
2. Very high values for certain observations, in some cases several times L(tr) − 
E(tr). The treatment of these observations needs a case by case analysis. 
3. Asymmetries in the behaviour of positive and negative realised LEQ factors. 
4. Evidence of a non-constant LEQ_i sample mean depending on the values of 
1 − E(f)/L(f). 


Figure 11.15 represents the distribution of the realised LEQ factors and undrawn 
amounts as a percentage of the limit, 1 − E(f)/L(f), and it can help to increase 
understanding of the main problems associated with this method. 

Figure 11.16 focuses on observations associated with values of realised LEQ 
factors less than 2. It is clear that there are observations with realised LEQ factors greater 
than one (upper shaded zones in Figs. 11.16 and 11.17) across the range of percent 
usage values, although such observations are much more common when the percent 
usage values are large (small values of 1 − e(tr)). 

Fig. 11.15 Realised LEQ factors and percent usage at the reference date 

Fig. 11.16 Realised LEQ factors smaller than two 

Fig. 11.17 Approximation for E[LEQ | 1 − e(tr)] and the adjusted regression curve 

For these reasons, before using this procedure, it is necessary to make some 
decisions after analysing the observations in the RDS, for example: 

• To eliminate from the RDS those anomalous observations with large LEQ_i factors 

• To censor other observations associated with LEQ_i factors greater than one 

• To remove observations with very low values of L(f) − E(f) from the RDS, 
because their LEQ_i values are not informative. 

In this example, observations with 1 − E(tr)/L(tr) < 0.1 and those with 
LEQ_i > 2 were removed from the reference data set. After these modifications 
of the RDS, the new LEQ_i sample mean is: 

$$LEQ(f) = \overline{LEQ} = \frac{1}{n} \sum_i LEQ_i = 0.08 \tag{11.53}$$


It is clear that this global estimate of 8% is very low for most of the facilities in 
the portfolio because of the weight in the global average of the negative realised 
LEQ factors associated with observations with low values of 1 − e(f). 

An improvement to the former estimate is to eliminate outliers, i.e. observations 
associated with very large (in absolute terms) realised LEQ factors. If observations 
with LEQ factors below the tenth percentile and above the ninetieth are considered 
outliers, the average restricted to the RDS without outliers is about 33% and this 
value is stable when the former percentiles are changed. 

$$LEQ(f) = \overline{LEQ} = \frac{1}{n} \sum_i LEQ_i = 0.33 \tag{11.54}$$

However, it is clear that local averages are very different and therefore this global 
estimate of 33% for the LEQ is not adequate. For this reason, it is necessary to 
consider different estimates for the LEQ factor for different values of 1 − E(f)/L(f). 
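
The effect of discarding observations outside the tenth and ninetieth percentiles can be sketched as a trimmed mean. The realised LEQ factors below are hypothetical, and the percentile convention (smallest sample value with at least the required share of the sample at or below it) is an assumption made here for the illustration.

```python
import math
from statistics import mean

def percentile(values, pct):
    """Smallest sample value with at least pct percent of the sample <= it."""
    ordered = sorted(values)
    k = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[k - 1]

def trimmed_mean(values, lower=10, upper=90):
    """Mean of the values lying between the lower and upper sample percentiles."""
    lo, hi = percentile(values, lower), percentile(values, upper)
    kept = [v for v in values if lo <= v <= hi]
    return mean(kept)

# Hypothetical realised LEQ factors with a few extreme values.
factors = [-60.0, -3.0, -0.4, 0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3,
           0.4, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8, 0.9, 1.5, 9.0]
plain = mean(factors)
trimmed = trimmed_mean(factors)
```

On this sample the plain mean is about −2.3 while the trimmed mean is about 0.15, the same qualitative effect as the move from a nonsensical global mean to a value inside [0, 1].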


Case RDS(f) = {O_i Such That Percent Usage e_i is Similar to e(f)} 


In this case, the RDS(f) comprises all the observations O_i with 1 − e(tr) ∈ [1 − e(f) 
− 0.2, 1 − e(f) + 0.2] and the average of the realised LEQ factors restricted to 
observations in the RDS(f) is used as the estimate of LEQ(f). To select a functional 
form for LEQ(f), first the estimated values for different 1 − e(tr) values are 
computed and second, a regression model is fitted using 1 − e(tr) as the 
explanatory variable, and the local sample mean as the dependent variable. After 
rejecting different models and using intervals of width 0.4, an expression for the 
“local”37 sample mean of LEQ factors based on a + b · √(1 − e(tr)) is obtained as: 

$$LEQ(f) = -0.82 + 1.49 \cdot \sqrt{1 - E(f)/L(f)} \tag{11.55}$$

with an adjusted R² equal to 0.94. Figure 11.17 represents the realised LEQ 
factors, the local averages and the adjusted function (with the constraint LEQ(f) ≥ 0). 

Therefore an estimator for EAD(f) of a facility f in normal status is given by: 

$$EAD(f) = E(f) + \operatorname{Max}\left[0, -0.82 + 1.49 \cdot \sqrt{1 - E(f)/L(f)}\right] \cdot (L(f) - E(f)) \tag{11.56}$$
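
A direct transcription of the estimator (11.56) is sketched below. The coefficients are those reported in the text; the function names and the two facilities are hypothetical, and the sketch assumes E(f) ≤ L(f) so that the square root is defined.

```python
import math

def leq_local_mean(exposure, limit):
    """Fitted local mean (11.55), floored at zero as in (11.56)."""
    undrawn_share = 1.0 - exposure / limit
    return max(0.0, -0.82 + 1.49 * math.sqrt(undrawn_share))

def ead_estimate(exposure, limit):
    """EAD(f) = E(f) + LEQ(f) * (L(f) - E(f)) with the floored local mean."""
    return exposure + leq_local_mean(exposure, limit) * (limit - exposure)

half_drawn = ead_estimate(50.0, 100.0)      # LEQ of about 0.234
mostly_drawn = ead_estimate(80.0, 100.0)    # fitted value negative, floored at 0
```

With these coefficients the floor becomes active when the undrawn share 1 − E(f)/L(f) falls below roughly 0.30, so for heavily drawn facilities the estimated EAD equals the current exposure.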


37 The “local” condition is to consider only those observations in an interval centred on 
1 − E(f)/L(f) and with length 0.4. 

11.7.2.3 The Median and the Conditional Quantiles 


The rationale under Model III is to explain directly the increase in exposure 
from the reference date to the default date. Therefore, it is necessary to explain 
EAD_i − E(tr) in terms of LEQ(RD_i) · (L(tr) − E(tr)). For simplicity, it is assumed 
that RD_i = {S(tr), L(tr) − E(tr)}, the focus is on observations with status S(tr) = 
“normal”, and the only quantitative variable that is taken into account is the current 
undrawn amount L(f) − E(f). Moreover, the loss function proposed in (11.39) is 
used to determine the optimal estimates and therefore, as shown in Sect. 11.6.3.1, 
the solution is to approximate the quantile Q[b/(a + b)] of the distribution of EAD_i − 
E(tr) conditional on those observations which satisfy L(tr) − E(tr) = L(f) − E(f). 

To approximate that quantile for each value of EAD(f) − E(f), the process is 
similar to the one explained in the previous section. First, RDS(f) is defined as all 
the observations such that (L(tr) − E(tr)) ∈ [(L(f) − E(f)) · 0.8, (L(f) − E(f)) · 1.2]. 
Second, for each value of L(tr) − E(tr) the optimal quantile is computed. Third, a 
linear regression model that uses L(tr) − E(tr) as the explanatory variable and the 
optimal quantile as the dependent variable is fitted and, finally, the estimator for 
LEQ(f) is obtained by using formula (11.44). 

Figure 11.18 represents, for each observation in the RDS with tr = td − {12, 11, 
10, 9, 8}, the pairs {L(tr) − E(tr), EAD_i − E(tr)} in the range of values of L(tr) − E(tr) 
given by [0, 17,000] €, for which it is considered that there is a sufficient number of 
observations. The shaded zones in Figs. 11.18 and 11.19 are defined as EAD_i > L(tr). 

Fig. 11.18 Observations in Model III 

Fig. 11.19 Quantiles conditional on the undrawn amount and adjusted EAD − E(tr) values 

The regression models for the local medians (case a = b) and for 
the 66.6th percentile (case 2 · a = b) produce the following results: 

$$\begin{aligned}
\operatorname{Median}[EAD(f) - E(f)] &= 86.8 + 0.76 \cdot (L(f) - E(f)) \\
\operatorname{Quantile}[EAD(f) - E(f), 0.666] &= 337.8 + 0.92 \cdot (L(f) - E(f))
\end{aligned} \tag{11.57}$$

with adjusted R² equal to 0.95 and 0.99 respectively. Therefore, the associated 
LEQ estimates, obtained by dividing (11.57) by L(f) − E(f), are almost constant (close 
to 0.76 and 0.92 respectively) and have values larger than the previous estimates. 

Figure 11.19 represents the local medians (Q50% line) and local 66.6th percentiles 
(Q66% line) obtained from the original points, and the regression lines associated with 
(11.57) (dotted line for the adjusted 66.6th percentiles, thick line for the adjusted local 
medians). 
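
The LEQ implied by the fitted lines in (11.57) can be computed by dividing the fitted increase by the undrawn amount. In the sketch below the coefficients are those reported in the text (with the intercept 337.8 as given in Appendix C); the function names and the undrawn amount of 10,000 € are illustrative assumptions.

```python
def increase_quantile(undrawn, conservative=False):
    """Fitted EAD(f) - E(f) from (11.57): median, or 66.6th percentile."""
    if conservative:
        return 337.8 + 0.92 * undrawn
    return 86.8 + 0.76 * undrawn

def implied_leq(undrawn, conservative=False):
    """LEQ obtained by dividing the fitted increase by L(f) - E(f)."""
    return increase_quantile(undrawn, conservative) / undrawn

undrawn = 10_000.0    # hypothetical undrawn amount in EUR, inside [0, 17,000]
leq_median = implied_leq(undrawn)
leq_conservative = implied_leq(undrawn, conservative=True)
```

For this undrawn amount the implied LEQ factors are about 0.77 and 0.95. The intercepts make the implied LEQ rise as the undrawn amount shrinks, so "almost constant" holds only where the undrawn amount is large relative to the intercept.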


11.8 Summary and Conclusions 


The following points summarise the current practice on CF and EAD estimates and 
highlight some problematic aspects: 


• The CF and EAD estimators applied by banks can be derived from special cases 
of regression problems, and therefore these estimators are based on conditional 
expectations 

• Implicitly, the use of these estimators assumes the minimisation of prediction 
errors by using a quadratic and symmetric loss function that is neither directly 
correlated with the errors in terms of minimum capital requirements nor 
penalises uncertainty. The way in which these errors are measured is crucial 
because they are very large 

• In most of the cases, the EAD estimates are based on the unrealistic assumption 
of a constant LEQ factor mean 

• Frequently, the basic statistics for the estimation process are censored to obtain 
realised LEQ factors between zero and one 

• Banks frequently use “Cohort Approaches” or “Fixed Time Horizon Approaches” 
to select the observations included in the estimation process. These approaches do 
not take into account all the relevant information because they only focus on a 
conventional reference date for each defaulted facility 

• With regard to risk drivers, the focus is on the rating at the reference date. 

Other approaches and some comments on different aspects: 


• For regulatory use, it seems appropriate for the estimators to be solutions to 
optimisation problems that use a loss function directly related with errors in 
terms of capital requirements 

• For example, a logical choice is to use a simple linear asymmetric loss function 
applied at the level of facility. This loss function enables banks or supervisors to 
quantify the level of conservatism implicit in the estimates 

• Using this loss function, the derived estimators are based on conditional quantiles 
(for example, the median for internal purposes and a more conservative 
quantile for regulatory use) 

• If the estimates are based on sample means, the LEQ factors should, as a minimum, 
depend on the level of the existing availability of additional drawdowns: 
LEQ(1 − e(tr)) 

• The common practice of censoring the realised LEQ factors to [0, 1] is not 
justified and, in general, it is not possible to conclude ex ante if the associated 
LEQ estimates are biased in a conservative manner 

• However, under certain hypotheses, the use of censored data does not change the 
optimal estimator for LEQ 

• The estimates should be based on observations at all the relevant reference dates 
for defaulted facilities, “Variable Time Approach” 

• With regard to risk drivers, if there is a warning system for the portfolio, it is 
important to focus on the status of the facility at the reference date rather than on 
the rating 

• The example presented here suggests that: 

– Estimates based on sample means are less conservative than those based on 
conditional quantiles above the median 

– The CF estimates obtained by using these conditional quantiles are so large 
that the use of downturn estimates in this case might not be a priority. 


Appendix A. Equivalence Between Two Minimisation Problems 


Proposition: Consider a set of observations O = {(x_i, y_i)}, i = 1,...,n, and the problem 
G.I given by: 

$$\operatorname{Minimise}_{g \in G} \sum_{i=1}^{n} L(y_i - g(x_i)) \quad \text{subject to } f(x) \ge g(x) \ge h(x) \tag{11.58}$$

where the error is measured in terms of the function L that satisfies: 

$$L(x + y) = L(x) + L(y) \quad \text{if } x \cdot y \ge 0 \tag{11.59}$$

then, g is a solution of Problem G.I if and only if it is a solution of Problem G.II 
given by: 

$$\operatorname{Minimise}_{g \in G} \sum_{i=1}^{n} L\left(\operatorname{Min}[\operatorname{Max}[y_i, h(x_i)], f(x_i)] - g(x_i)\right) \quad \text{subject to } f(x) \ge g(x) \ge h(x) \tag{11.60}$$

Proof: The set O can be partitioned into three classes O = O⁺ ∪ O⁻ ∪ O⁼, where: 

$$O^{+} = \{(x_i, y_i) \mid y_i > f(x_i)\}, \quad O^{-} = \{(x_i, y_i) \mid y_i < h(x_i)\} \tag{11.61}$$

For observations in O⁺: 

$$(y_i - f(x_i)) \cdot (f(x_i) - g(x_i)) \ge 0 \tag{11.62}$$

Therefore, from (11.59) and (11.62), the error in Problem G.I associated with an 
observation in O⁺ can be expressed in terms of the error in Problem G.II plus an 
amount independent of g: 

$$\begin{aligned}
\operatorname{err}[G.I, (x_i, y_i)] &= L(y_i - g(x_i)) = L(y_i - f(x_i) + f(x_i) - g(x_i)) \\
&= L(y_i - f(x_i)) + L(f(x_i) - g(x_i)) \\
&= L(y_i - f(x_i)) + L\left(\operatorname{Min}[\operatorname{Max}[y_i, h(x_i)], f(x_i)] - g(x_i)\right) \\
&= L(y_i - f(x_i)) + \operatorname{err}[G.II, (x_i, y_i)]
\end{aligned} \tag{11.63}$$

But the set O⁺ does not depend on the function g, therefore for these observations, 
and for all g, the error in Problem G.I can be decomposed into a fixed amount, 
independent of the function g, given by Σ L(y_i − f(x_i)), where the index i runs 
over the observations in O⁺, and the error in Problem G.II. 

Similarly, for observations in O⁻, the error in Problem G.I is equal to the error in 
Problem G.II plus the fixed amount Σ L(y_i − h(x_i)). 

Finally, for the observations in O⁼ the errors in Problem G.I and in Problem G.II 
are the same. 
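
The additivity condition (11.59) is what singles out the linear asymmetric loss. A short check for L(x) = a · x if x ≥ 0 and L(x) = b · |x| if x < 0, considering the two cases in which x and y have the same sign (the case in which one of them is zero is immediate because L(0) = 0):

$$\begin{aligned}
x, y \ge 0: &\quad L(x + y) = a \cdot (x + y) = a \cdot x + a \cdot y = L(x) + L(y) \\
x, y \le 0: &\quad L(x + y) = b \cdot |x + y| = b \cdot |x| + b \cdot |y| = L(x) + L(y)
\end{aligned}$$

By contrast, a quadratic loss gives L(x + y) = x² + 2 · x · y + y², which exceeds L(x) + L(y) whenever x · y > 0, so the proposition does not cover it.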
Appendix B. Optimal Solutions of Certain Regression 
and Optimization Problems 


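Let X and Y be random variables with joint distribution given by F(x,y), then we get 
in the case of a quadratic loss function 

$$d^*(x) = E(Y \mid X) = \operatorname{Min}_{d} \left\{ E\left[(Y - d(X))^2\right] \right\} \tag{11.64}$$

In the case of the linear asymmetric loss function, with a > 0 and b > 0: 

$$L(x) = \begin{cases} a \cdot x & \text{iff } x \ge 0 \\ b \cdot |x| & \text{iff } x < 0 \end{cases} \tag{11.65}$$

the following is found: 

$$d^*(x) = Q\left(Y \mid X = x, \frac{b}{a+b}\right) = \operatorname{Min}_{d} \left\{ E\left[L(Y - d(X))\right] \right\} \tag{11.66}$$

See, for example, Pratt et al. (1995, pp. 261-263). 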
Therefore, a solution for (11.28) can be obtained from (11.64), and taking into 
account: 


\[ Y = \frac{EAD - E}{L}; \quad d(X = RD) = LEQ(RD) \cdot h(RD); \quad \text{where } h(RD) = \frac{L - E}{L} \tag{11.67} \]


Then, $d^{*}$ is given by (11.64) and, assuming that $h(RD) = h(f)$ for observations in
RDS(f):


[Equation (11.68), which gives $d^{*}(X = RD(f))$, is not recoverable from the source.]


The result shown in (11.29) is obtained from the former equation.
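
The quantile property stated for the linear asymmetric loss can be verified on a sample. The following is a minimal sketch under assumptions of our own (weights a = 3 and b = 1, and 401 distinct integer outcomes drawn at random); it does not use the chapter's data.

```python
import math
import random

A, B = 3, 1   # cost per unit of under- and over-estimation (assumed values)

def loss(x):
    """Linear asymmetric loss: a*x for x >= 0, -b*x for x < 0."""
    return A * x if x >= 0 else -B * x

def total_loss(d, sample):
    """Sample analogue of E[L(Y - d)], up to the factor 1/n."""
    return sum(loss(y - d) for y in sample)

random.seed(1)
sample = sorted(random.sample(range(100_000), 401))   # 401 distinct integer outcomes

# The objective is piecewise linear in d, so a minimiser exists among the sample points.
best_d = min(sample, key=lambda d: total_loss(d, sample))

# Empirical a/(a+b) quantile: the k-th smallest value with k = ceil(n * a / (a + b)).
k = math.ceil(len(sample) * A / (A + B))
quantile = sample[k - 1]
```

With these weights the minimiser of the total loss coincides with the empirical 75% quantile, in line with the level a/(a+b).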
Appendix C. Diagnostics of Regression Models


Model II (Sect. 11.7.2.1) 


• By using original data:

\[ \frac{EAD_i}{L(td-12)} - \frac{E(td-12)}{L(td-12)} = 0.64 \cdot \Bigl(1 - \frac{E(td-12)}{L(td-12)}\Bigr) \tag{11.69} \]

• By using censored data:

\[ \frac{EAD_i}{L(td-12)} - \frac{E(td-12)}{L(td-12)} = 0.7 \cdot \Bigl(1 - \frac{E(td-12)}{L(td-12)}\Bigr) \tag{11.70} \]


• By using a variable time approach:


\[ \frac{EAD_i}{L(tr)} - \frac{E(tr)}{L(tr)} = 0.49 \cdot \Bigl(1 - \frac{E(tr)}{L(tr)}\Bigr) \tag{11.71} \]


Model I (Sect. 11.7.2.2) 


• By using Model I, variable time approach:


\[ LEQ(f) = -0.82 + 1.49 \cdot \sqrt{1 - E(f)/L(f)} \tag{11.72} \]


The diagnostics for this regression model are: 
Model III (Sect. 11.7.2.3)


• By using a variable time approach:


\[
\begin{aligned}
\operatorname{Median}[EAD(f) - E(f)] &= 86.8 + 0.76 \cdot (L(f) - E(f)) \\
\operatorname{Quantile}[EAD(f) - E(f),\, 0.666] &= 337.8 + 0.92 \cdot (L(f) - E(f))
\end{aligned}
\tag{11.73}
\]


With the diagnostics given by: 


and for the quantile: 


Appendix D. Abbreviations 


AIRB             Advanced internal ratings-based approach
CCF              Credit conversion factor
CF               Conversion factor
EAD              Exposure at default
EAD_i = E(td)    Realised exposure at default associated with O_i
EAD(f)           EAD estimate for f
ead_i            Realised percent exposure at default, associated with O_i
E(t)             Usage or exposure of a facility at the date t
e(t)             Percent usage of a facility at the date t
e_i = e(tr)      Percent usage associated with the observation O_i, i = {g, tr}
f                Non-defaulted facility
g                Defaulted facility
i = {g, tr}      Index associated with the observation of g at tr
IRB              Internal ratings-based approach
LEQ              Loan equivalent exposure
LEQ(f)           LEQ estimate for f
LEQ_i            Realised LEQ factor associated with the observation O_i
LGD              Loss given default
L(t)             Limit of the credit facility at the date t
O_i              Observation associated with the pair i = {g, tr}
PD               Probability of default
Q_a = Q(x, a)    Quantile associated with the a% of the distribution F(x)
RDS              Reference data set
RDS(f)           RDS associated with f
RD               Risk drivers
S(tr)            Status of a facility at the reference date tr
t                Current date
td               Default date
tr               Reference date
td − tr          Horizon
Chapter 12
Validation of Banks’ Internal Rating Systems: 
A Supervisory Perspective 


Stefan Blochwitz and Stefan Hohl 


12.1 Basel II and Validating IRB Systems 


12.1.1 Basel’s New Framework (Basel II and Further Work) 


“Basel II and further work” is associated with the work undertaken by the Basel
Committee on Banking Supervision (BCBS).¹ This aimed to secure international
convergence on revisions to supervisory regulations on capital adequacy standards
of internationally active banks. The main objective of the 1988 Accord² and its
revision is to develop a risk-based capital framework that strengthens and stabilises
the banking system. At the same time, it should provide for sufficient consistency
on capital adequacy regulation across countries in order to minimize competitive
inequality among international banks. In June 2004, the BCBS issued “Basel II”,
titled “International Convergence of Capital Measurement and Capital Standards:


¹The Basel Committee on Banking Supervision is a committee of banking supervisory authorities
that was established by the central bank governors of the Group of Ten countries in 1975. Up to 
2009 it consisted of senior representatives of bank supervisory authorities and central banks from 
Belgium, Canada, France, Germany, Italy, Japan, Luxembourg, the Netherlands, Spain, Sweden, 
Switzerland, the United Kingdom, and the United States. The membership of the Basel Committee 
on Banking Supervision was broadened in June 2009. The new members are representatives from 
the G20 countries that were not in the Basel Committee before. These are Argentina, Indonesia, 
Saudi Arabia, South Africa and Turkey. In addition, Hong Kong SAR and Singapore had also been 
invited to become BCBS members. 


²See Basel Committee on Banking Supervision (1988).
A Revised Framework”, carefully crafting the balance between convergence and
differences in capital requirements. 

In December 2009 the enlarged BCBS issued a consultative document titled 
“Strengthening the Resilience of the Banking Sector” as part of its reform package 
to address lessons from the financial crisis starting in 2007. The proposals aim at 
strengthening global capital and liquidity regulations to promote a more resilient 
banking sector in the future. Accordingly, enhancing risk coverage as well as 
reducing procyclical amplification of financial shocks throughout the financial 
system are among its key objectives. Both may have an impact on issues related
to the validation of banks’ internal risk management systems.

For example, one of the proposed future requirements is the use of stressed inputs
for determining banks’ capital requirements for counterparty credit risk. The cyclicality
of minimum capital requirements over time has always been a key consideration
for the BCBS during the design of the Basel II framework. As such, the BCBS
had introduced a number of safeguards to address this issue, including the requirement
to use long-term data horizons to estimate probabilities of default (PD) and the
introduction of downturn loss-given-default (LGD) estimates.

This paper presents pragmatic views on validating IRB systems. It discusses 
issues related to the challenges facing supervisors and banks of validating the 
systems that generate inputs into the internal ratings-based approach (IRBA) used 
to calculate the minimum regulatory capital for credit risk, based on internal bank 
information. 

The key role of banks as financial intermediaries highlights their core competences:
lending, investing and risk management. In particular, analysing and
quantifying risks is a vital part of efficient bank management. An appropriate
corporate structure is vital to successful risk management. Active credit risk
management is indispensable for efficiently steering a bank through the economic
and financial cycles, despite the difficulties stemming from a lack of credit risk data.

A well-functioning credit risk measurement system is the key element in every 
bank’s internal risk management process. It is interesting to note that the debate 
about the new version of the Basel Capital Accord (Basel II and further work), 
which establishes the international minimum requirements for capital to be held by 
banks, has moved this topic back to the centre of the discussion about sound 
banking. The proper implementation of the IRBA is one key aspect of a lively 
debate among bankers, academics and regulators. At the same time a paradigm shift 
in credit risk management has taken place. 

Previously, credit risk assessment used only the experience, intuition and powers
of discernment of a few select specialists. The new process, built on banks’ internal
rating systems, is more formalised, standardised and much more objective. The
human element has not been entirely discounted, however: human
judgement and rating systems now play equally important roles in deciding the
credit risk of a loan.

Since the IRBA has been implemented in most of the G10 countries in
the past and will be implemented in almost all G20 countries in the near future, the
debate on the IRBA has shifted its accent. More emphasis is now given to the
problem of validating a rating system, rather than how to design a rating system.
Both the private sector and banking supervisors need well-functioning rating
systems. This overlap of interests and objectives is reflected in the approach
towards validation of rating systems, even if different objectives imply different
priorities in qualifying and monitoring the plausibility of such systems.

We will discuss some of the challenges faced by banks and supervisors, aware that 
we have only scratched the surface. This is followed by a discussion of some of the 
responses given by the BCBS. We then will discuss a pragmatic approach towards 
validating IRB systems while touching on some issues previously raised. However, we 
would like to stress that implementation of Basel II, and especially the validation of IRB 
systems (and similarly AMA models for operational risk) requires ongoing dialogue 
between supervisors and banks. This article, including its limitations, offers a concep- 
tual starting point to deal with the issues related to the validation of IRB systems. 


12.1.2 Some Challenges 


The discussion on validation has to start with a discussion of the structure and usage
of internal rating systems within banks. The two-dimensional risk assessment for
credit risk required in Basel II aims to quantify borrower risk, via the probability
of default (PD) for a rating grade, and facility risk, by means of the loss given
default (LGD). The third dimension is the facility’s exposure at default (EAD).

The broad structure of a bank-internal rating system is shown in Fig. 12.1. First, 
the information, i.e. the raw data on the borrower to be rated have to be collected in 
accordance with established banking standards. Accordingly, the data is used to 
determine the potential borrower’s credit risk. In most cases, a quantitative rating 
method which draws on the bank’s previous experience with credit defaults is 
initially used to determine a credit score. Borrowers with broadly similar credit 
scores, reflecting similar risk characteristics, are typically allocated to a preliminary 
risk category, i.e. rating grade. Usually, a loan officer then decides the borrower’s 
final rating and risk category, i.e. this stage involves the application of qualitative 
information. 

A well-working rating system should demonstrate that the risk categories differ
in terms of their risk content. The quantification of risk parameters is based on the
bank’s own historical experience, backed by other public information and, to a certain
extent, private information. For example, the share of borrowers in a given rating
category who have experienced an occurrence defined as a credit default³ within a


³What constitutes credit default is a matter of definition. For banks, this tends to be the occurrence
of an individual value adjustment, whereas at rating agencies, it is insolvency or evident payment 
difficulties. The IRBA included in the new Basel Capital Accord is based on an established 
definition of default. Compared with individual value adjustments, the Basel definition of default 
provides for a forward-looking and therefore relatively early warning of default together with 
a retrospective flagging of payments that are 90 days overdue. 
Fig. 12.1 Schematic evolution of a rating process and its integration in the bank as a whole


given time-frame, usually 1 year, will be used for the estimation process. The 
described standardisation of ratings allows the use of quantitative models where 
sufficient borrower data is available and highlights the need for high-quality, informa- 
tive data. 
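
The estimation step described here, taking the share of borrowers in a rating category that defaulted within one year, can be sketched as follows. The grades and counts are invented for illustration and do not come from the chapter.

```python
from collections import Counter

def default_rates(observations):
    """Share of borrowers per rating grade that defaulted within the horizon."""
    borrowers = Counter(grade for grade, _ in observations)
    defaults = Counter(grade for grade, defaulted in observations if defaulted)
    return {grade: defaults[grade] / borrowers[grade] for grade in sorted(borrowers)}

# Hypothetical one-year observations: (rating grade, defaulted within one year?)
observations = (
    [("A", False)] * 199 + [("A", True)] * 1
    + [("B", False)] * 97 + [("B", True)] * 3
    + [("C", False)] * 45 + [("C", True)] * 5
)
rates = default_rates(observations)

# A rating system that differentiates risk should show default rates rising from grade to grade.
ordered = list(rates.values())
grades_differentiate = all(lo < hi for lo, hi in zip(ordered, ordered[1:]))
```

The final check mirrors the requirement that risk categories differ in their risk content; with few borrowers per grade the observed rates become noisy, which is why data quality and quantity matter.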

For consumer loans, the BCBS also allows risk assessment on the level of 
homogenous retail portfolios that are managed accordingly by the bank on a pool 
basis. The challenge for banks is to identify such homogenous portfolios exhibiting 
similar risk characteristics. This leads to the importance of using bank-internal data, 
which plays a crucial role in both the segmentation process used to find homogenous 
portfolios, and the quantification process used for the risk parameters. One of the 
techniques used for segmentation and quantification is the utilisation of so-called 
“roll rates”, where different delinquency stages are defined (for example 30 days, 
60 days etc.). Counting the roll rate from one delinquency stage to another and filling 
the migration matrix would serve as a basis for estimating the PDs for those 
exposures. 
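
Counting roll rates and filling the migration matrix can be sketched as below. The delinquency stages, the one-month observation step and the transition counts are assumptions made for the example; treating the 90-days stage as the default proxy is likewise an assumption, and is exactly the kind of choice the following paragraph questions.

```python
STAGES = ["current", "30dpd", "60dpd", "90dpd"]   # days past due; labels are hypothetical

def roll_rate_matrix(transitions, stages=STAGES):
    """Row-normalised migration matrix built from (stage at start, stage at end) pairs."""
    counts = {s: {t: 0 for t in stages} for s in stages}
    for start, end in transitions:
        counts[start][end] += 1
    matrix = {}
    for s in stages:
        total = sum(counts[s].values())
        matrix[s] = {t: (counts[s][t] / total if total else 0.0) for t in stages}
    return matrix

# Hypothetical one-month transitions for a retail pool
transitions = (
    [("current", "current")] * 90 + [("current", "30dpd")] * 10
    + [("30dpd", "current")] * 6 + [("30dpd", "60dpd")] * 4
    + [("60dpd", "30dpd")] * 1 + [("60dpd", "90dpd")] * 3
    + [("90dpd", "90dpd")] * 2
)
matrix = roll_rate_matrix(transitions)

# Probability of rolling straight through from "current" to "90dpd" in three steps
straight_roll = (
    matrix["current"]["30dpd"] * matrix["30dpd"]["60dpd"] * matrix["60dpd"]["90dpd"]
)
```

The product of successive roll rates is only one possible route from the matrix to a PD; it ignores cures followed by renewed delinquency and the length of the horizon.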

There are a couple of issues related to this procedure. Firstly, there is the issue of 
segmentation, i.e. do roll rates take into account all relevant risk drivers as required 
in the Basel II framework? Secondly, for quantification purposes, how will roll rates 
be translated into PDs, more specifically, which delinquency class should be used 
(to comply with the Basel II framework), and to what extent can these PDs be 
validated? Lastly, in many instances a quicker reaction to current conditions,
sometimes coupled with a longer time horizon, might be needed for purposes of 
risk management and pricing, especially for retail exposures. How would such 


⁴Fritz et al. (2002).
quantification processes for PDs (and LGDs) be reconciled with the application of the
use-test as required in Basel II?

Another issue relates to the modification performed by a credit officer of the
automated rating proposal, i.e. a qualitative adjustment. This may call into question the
rigour needed for validation, especially in cases where documentation may be
insufficient and the information used is more qualitative, the latter being a
general problem in credit assessments.

A simple, but important, question is who has the responsibility for validating a 
rating system in the context of Basel II, given that the calculation of minimum 
regulatory capital is legally binding and set by the supervisors. In addition, a valid 
point in this regard is that some requirements, for example, the quantification 
process focussing on long-term averages to reduce volatility of minimum regu- 
latory capital requirements, are not fully in line with bank practice. This may lead to 
a different quantification process, i.e. a second process for the sole purpose of 
meeting supervisory standards, or even to a different risk management process as 
suggested above in the retail portfolios. In sum, the use-test requirement, the extent to 
which an internal rating system is used in daily banking business, will play a crucial 
role in assessing compliance with Basel II implementation including the validation of 
IRB systems. 

Since a bank’s internal rating systems are individual and, in the best case, fully
tailored to the bank’s needs, validation techniques must be as individual as the
rating system they are used for. As an example, we highlight the so-called
Low-Default-Portfolios. As the IRB framework in Basel II is intended to apply to all
asset classes, there are naturally portfolios which exhibit relatively few defaults or even no
defaults at all.⁵ This makes the quantification of PDs and LGDs, which is required to be
grounded in historical experience, extremely challenging. Thus, a straightforward
assessment based on historic losses would not be sufficiently reliable for the
quantification process of the risk parameters, but conservative estimates serving
as an upper benchmark may be derived (cf. Chap. 5).
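
One standard way to obtain such a conservative upper benchmark when no defaults have been observed is a one-sided binomial confidence bound. The sketch below assumes independent defaults and a single grade; it is given for orientation only, and the approach in Chap. 5 may differ in detail.

```python
def pd_upper_bound_no_defaults(n_borrowers, confidence):
    """Largest PD p for which observing zero defaults among n independent borrowers
    still has probability of at least 1 - confidence, i.e. (1 - p)**n = 1 - confidence."""
    if n_borrowers <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need n_borrowers > 0 and 0 < confidence < 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_borrowers)

# Hypothetical portfolio sizes, each with no observed default, at 90% confidence
bounds = {n: pd_upper_bound_no_defaults(n, 0.90) for n in (50, 100, 1000)}
```

The bound shrinks as the number of default-free borrowers grows, so it depends on portfolio size and on the chosen confidence level rather than on any observed loss.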

Some of the issues raised in this section have been discussed by the BCBS. 


12.1.3 Provisions by the BCBS 


The Subgroup on Validation (AIGV)⁶ of the BCBS’ Accord Implementation Group
(AIG) was established in 2004. The objective of the AIGV is to share and exchange 


⁵See BCBS newsletter No 6, “for example, some portfolios historically have experienced low
numbers of defaults and are generally — but not always — considered to be low-risk (e.g. portfolios 
of exposures to sovereigns, banks, insurance companies or highly rated corporate borrowers)”. 
⁶The Validation Subgroup is focusing primarily on the IRB approach, although the principles
should also apply to validation of advanced measurement approaches for operational risk. A 
separate Subgroup has been established to explore issues related to operational risk (see BCBS 
newsletter No 4.). 
views related to the validation of IRB systems. To the extent possible, the groups
should also narrow gaps between different assessments of the New Framework by 
the different supervising agencies represented in the AIGV. The objective of the 
validation of a rating system is to assess whether a rating system can — and 
ultimately does — fulfil its task of accurately distinguishing and measuring credit 
risk. The common view describes the term “validation” as a means to combine 
quantitative and qualitative methods. If applied together, it should indicate whether 
a rating system measures credit risks appropriately and is properly implemented in 
the bank in question. 

The BCBS newsletter No. 4, January 2004, informs about the work of the AIGV 
in the area of validation in Basel II. The most important information provided was 
the relatively simple answer to the question, “what aspects of validation will be 
looked at?” Despite the importance of validation as a requirement for the IRB 
approach, the New Framework does not explicitly specify what constitutes valida- 
tion. Consequently, the Subgroup reached agreement on that question. In the context 
of rating systems, the term “validation” encompasses a range of processes and 
activities that contribute to assessing whether ratings adequately differentiate risk, 
and importantly, whether estimates of risk components (such as PD, LGD, or EAD) 
appropriately characterise and quantify the relevant risk dimension. 

Starting with this definition, the AIGV developed six important principles (see
Fig. 12.2) on validation that result in a broad framework for validation. The
validation framework covers all aspects of validation, including the goal of valida- 
tion (principle 1), the responsibility for validation (principle 2), expectations on 
validation techniques (principles 3, 4, and 5), and the control environment for 
validation (principle 6). Publishing these principles was a major step in clarifying 
the ongoing discussions between banks and their supervisors about validation for at 
least three reasons: 


1. The principles establish a broad view on validation. Quite often, validation was 
seen as being restricted to only dealing with aspects related to backtesting. The 
established broad view on validation reinforces the importance of the minimum 
requirements of the IRBA, as well as highlighting the importance of risk- 
management. The debate around the IRBA was too often restricted to solely 
risk quantification or measurement aspects. We think that this balanced perspec- 
tive, including the more qualitative aspects of the IRBA, reflects the short- 
comings in establishing and validating rating systems, especially given the 
data limitations. This clarification also formed the basis for the development 
of validation principles for the so-called “Low Default Portfolios (LDPs)” as 
proposed in the BCBS newsletter No. 6 from August 2005. 

2. The responsibility for validation and the delegation of duties has also been 
clarified. The main responsibility lies rightfully with the bank, given the 
importance of rating systems in the bank’s overall risk management and capital 
allocation procedures. Since validation is seen as the ultimate sanity-check for 
a rating system and all its components, this task clearly must be performed by 
the bank itself, including the final sign-off by senior management. It should be 
Principle 1: Validation is fundamentally about assessing the predictive ability
of a bank’s risk estimates and the use of ratings in credit processes

The two-step process for rating systems requires banks firstly to discriminate
adequately between risky borrowers (i.e. being able to discriminate between
risks and their associated risk of loss) and secondly to calibrate risk (i.e. being able
to accurately quantify the level of risk). The IRB parameters must, as always
with statistical estimates, be based on historical experience which should form
the basis for the forward-looking quality of the IRB parameters. IRB validation
should encompass the processes for assigning those estimates including
the governance and control procedures in a bank.


Principle 2: The bank has primary responsibility for validation 

The primary responsibility for validating IRB systems lies with the bank itself
and does not remain with the supervisor. This certainly should reflect the
self-interest of banks and their need to have a rating system in place that reflects
their business. Supervisors obviously must review the bank’s validation processes
and should also rely upon additional processes in order to get the adequate
level of supervisory comfort.


Principle 3: Validation is an iterative process 

Setting up and running an IRB system in real life is by design an iterative process.
Validation, as an important part of this cycle, should therefore be an ongoing,
iterative process following an iterative dialogue between banks and
their supervisors. This may result in a refinement of the validation tools used.


Principle 4: There is no single validation method 

Many well-known validation tools like backtesting, benchmarking, replication,
etc. are a useful supplement to the overall goal of achieving a sound IRB
system. However, there is unanimous agreement that there is no universal tool
available which could be used across portfolios and across markets.


Principle 5: Validation should encompass both quantitative and qualitative 
elements 

Validation is not a technical or solely mathematical exercise. Validation must
be considered and applied in a broad sense: its individual components like data,
documentation, internal use and the underlying rating models, and all processes
which the rating system uses, are equally important.


Principle 6: Validation processes and outcomes should be subject to independent
review

For IRB systems, there must be an independent review within the bank. This
specifies neither the organigram in the bank nor its relationship across departments,
but the review team must be independent of the designers of the IRB
system and those who implement the validation process.


Fig. 12.2 The six principles of validation 


noted that only banks can provide the resources necessary to validate rating 
systems. 

3. Principles 3-5 establish a comprehensive approach for validating rating systems. 
This approach proposes the key elements of a broad validation process, on which 
we will elaborate more in the next section. 
12.2 Validation of Internal Rating Systems in Detail


According to the BCBS elaboration on the term “validation”, we consider three 
mutually supporting ways to validate bank internal rating systems. This encom- 
passes a range of processes and activities that contribute to the overall assessment 
and final judgement. More specifically, this can be directly related to the application 
of principle 4 and principle 5 of the BCBS newsletter as discussed above. 


1. Component-based validation: analyses each of the three elements (data
collection and compilation, quantitative procedure and human influence) for
appropriateness and workability.

2. Result-based validation (also known as backtesting): analyses the rating
system’s quantification of credit risk ex post.

3. Process-based validation: analyses the rating system’s interfaces with other
processes in the bank and how the rating system is integrated into the bank’s
overall management structure.


12.2.1 Component-Based Validation 


12.2.1.1 Availability of High-Quality Data 


Ensuring adequate data quality is the key task which, for at least two reasons, must 
be addressed with the greatest urgency. First, as the rating is based primarily on 
individual borrowers’ current data, it can only be as good as the underlying data. 
Second, the quantitative component of the rating process requires a sufficiently 
reliable set of data, including a cross-sectional basis, which is crucial for calibration 
of the risk parameters. Accordingly, both historical data and high-quality recent 
data are essential to ensure that a rating system can be set up adequately, and will 
also be successful in the future. Clearly, the availability of data (financial versus
account-specific information) and its use differ across borrower types
(wholesale versus consumer). Activities in consumer finance may
produce more bank-specific behavioural data whereas financial information for
large wholesale borrowers should be publicly available. However, reliable
and informative data, especially for mid-size privately owned
borrowers, are frequently not available, for several reasons:


• Data compilation and quality assurance incur high costs because they require
both qualified staff and a high-performance IT infrastructure. In addition, these
tasks seem to have little to do with original banking business in its strict sense,
and their usefulness may only become apparent years later. Clearly, proper
investment is needed, adding pressure to costs and competition.

• Similarly, it is a costly exercise in staffing and resource allocation in credit
departments. However, the Basel II efforts may have helped to allocate more
resources to capturing adequate and reliable credit data.

• In reality, borrowers are also often reluctant to supply the requested data. This
may be because, especially at smaller enterprises, this data is not readily
available. Admittedly, because of the predominant classic “house bank” system
in Germany, this information historically had not been requested. Also, potential
misuse of data and reluctance on the part of firms to provide information on their
own economic situation seem to be a widespread concern. Sometimes, data is
passed on to a very limited number of third parties only.⁷

• Further concentration in the banking industry is also contributing to the problem.
Owing to the lack of uniform standards for banks, in the event of a merger,
different sets of data have to be synchronized; this adds a new dimension to the
problem and is, again, no quick or easy task.


A thorough knowledge of the IT systems underlying the rating approach is 
necessary for the proper assessment of data quality; in addition the following may 
help to provide a realistic evaluation: 


• Ensuring data quality: The sheer existence and quality of bank internal guidelines,
including tests around them, is an indication of the importance banks place
on good data quality. Whether a bank takes its own guidelines seriously can be
gauged from day-to-day applications. For instance, data quality assurance can
reasonably be expected to be as automated as possible to ensure that a uniform
standard is applied throughout the bank. Also, comparison with external sources
of data seems to be necessary to ensure data plausibility.

• Bank-wide use of the data: The extent to which data are used allows assessing
the confidence that the bank has in its data. This leads to two consequences. On
the one hand, frequent and intensive use of specific data within a bank exposes
inconsistencies which might exist. On the other hand, the more
people are able to adjust data manually, the more likely its contamination
becomes, unless suitable countermeasures are taken.


12.2.1.2 The Quantitative Rating Models 


The second facet of the rating process, in the broadest sense, is the mathematical 
approach which can be used to standardise the use of data. The aim is to compress data 
collected in the first stage to prepare and facilitate the loan officer’s decision on the 
credit risk assessment of a borrower. In recent years, the analysis and development of 
possible methods has been a focus of research at banks and in microeconomics. 

The second stage methods attribute to each borrower, via a rating function $f_{Rat}$, a
continuous or discrete risk measure Z, a rating score, which is dependent on both the

⁷An indication of this attitude, which is widespread in Germany, is, for example, the approach that
is adopted to the obligation laid down in Section 325 of the German Commercial Code for 
corporations to publish their annual accounts. No more than 10% of the enterprises concerned 
fulfil this statutory obligation. 
individual features of each borrower $X_1, X_2, \dots, X_N$ (also denoted as risk factors)
and free, initially unknown model parameters $\alpha_1, \alpha_2, \dots, \alpha_M$:


\[ Z = f_{Rat}(\alpha_1, \dots, \alpha_M; X_1, \dots, X_N). \]


The value of Z permits the suggested rating to be derived from the quantitative 
analysis of the borrower concerned, in that each value of Z is allocated precisely to 
one of Y various rating categories. The methods suitable for this kind of quantitative 
component can be classified as: 


• Statistical methods: This is probably the best known and the most widespread group
of methods. They are used by almost all banks in both corporate and private sector
business. The best known of such methods are discriminant analyses (primarily in
corporate business) and logit regressions (used mainly as scorecards in private
sector business). Generalised regression and classification methods (such as neural
networks) also belong in this category, even if they are rarely used in practice.

• Rule-based systems: Such systems model the way in which human credit experts
reach a decision and are used in corporate business. They comprise a set of
predetermined “if ... then” rules (i.e. expert knowledge). Each enterprise is first
graded according to these rules. The next stage is for the rules matched by the
firm to be aggregated in order to give a risk rating.

• Benchmarking methods: In these methods, observable criteria, such as bond
spreads, are used to compare borrowers with unknown risk content with rated
borrowers with known risk content, the so-called benchmarks.

• Applied economic models: Option price theory models are the most widely
known. They enable, for example, an enterprise’s equity capital to be modelled
as a call option on its asset value and thus the concepts used in option price theory
to be applied to credit risk measurement. The starting point for the development of
these models was the Merton model; KMV has been successful in its further
development offering its Public Firm Model for listed enterprises and a Private
Firm Model for unlisted enterprises (Crosbie and Bohn 2001), now marketed
under the “Moody’s KMV”-label.
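
How a rating function and the allocation of each value of Z to exactly one rating category fit together can be sketched for the simplest case, a linear score. The weights, cut-offs and grade labels below are invented for the example, as is the convention that a larger Z indicates a better borrower.

```python
from bisect import bisect_right

ALPHAS = [0.2, 1.5, -0.8]        # alpha_0 (intercept), alpha_1, alpha_2: hypothetical values
CUTOFFS = [-1.0, 0.0, 1.0]       # hypothetical score boundaries between grades
GRADES = ["D", "C", "B", "A"]    # worst to best (assumed convention)

def rating_score(features, alphas=ALPHAS):
    """Z = alpha_0 + alpha_1 * X_1 + ... + alpha_N * X_N."""
    return alphas[0] + sum(a * x for a, x in zip(alphas[1:], features))

def rating_grade(z, cutoffs=CUTOFFS, grades=GRADES):
    """Allocate each value of Z to exactly one rating category."""
    return grades[bisect_right(cutoffs, z)]

strong = rating_grade(rating_score([1.0, 0.5]))    # high X_1, moderate X_2
weak = rating_grade(rating_score([-1.0, 1.0]))     # low X_1, high X_2
```

In an empirical model the weights would be fitted to data on known borrowers; in an expert method they would be set by credit experts. The mapping from score to grade is the same in both cases.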


Another classification distinguishes between empirical models, where the para- 
meters are determined from data of known borrowers by using mathematical or 
numerical optimisation methods, and expert methods, where the parameters are 
specified by credit experts based on their experience. Basically, the difference lies 
in the specification of the model parameters $\alpha_1, \alpha_2, \dots, \alpha_M$.


12.2.1.3 The Model Itself 


Transparency, intelligibility and plausibility are crucial for validating the 
appropriateness of the rating process. Clearly, with the set of rules for expert 
systems, or with the underlying model in the case of benchmarking methods and applied 
economic models, these requirements seem to be easily fulfilled. The situation 
regarding statistical models is somewhat more complex, as there is no economic 
theory underlying these models. However, certain basic economic requirements 
should also be incorporated when using statistical models. For example, experience 
has shown that many risk factors are invariably more marked among “good” borrowers 
than among “bad” borrowers. Likewise, if the risk measure Z is required to be 
invariably larger among better borrowers than among worse borrowers, the direct 
consequence is that the monotonicity of the risk factor must also be evident in the 
monotonicity of the risk measure. Therefore, for the i-th risk factor $X_i$, the 
following applies: 

$$\frac{\partial Z}{\partial X_i} = \frac{\partial f_{\mathrm{Rat}}(\alpha_1, \ldots, \alpha_M; X_1, \ldots, X_N)}{\partial X_i} > 0.$$


Economic plausibility leads to the exclusion of “non-monotonous” risk factors in 
linear models. Non-monotonous risk factors are, for example, growth variables, 
such as changes in the balance sheet total, changes in turnover etc. Experience 
shows that both a decline and excessively high growth of these variables imply a 
high risk. Such variables cannot be processed in linear models, i.e. in models like 
$Z = \alpha_0 + \alpha_1 X_1 + \cdots + \alpha_N X_N$, because, owing to 

$$\frac{\partial Z}{\partial X_i} = \alpha_i = \text{const.},$$

the plausibility criterion in these models cannot be fulfilled for non-monotonous 
features.⁸ Further economic plausibility requirements and sensitivity analysis should 
be considered in a causal relationship with economic risk; for example, the 
creditworthiness of an enterprise cannot be derived from the managing director’s shoe size! 
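
The point about non-monotonous factors can be illustrated with a small numerical 
sketch. All figures are hypothetical: a U-shaped default rate as a function of turnover 
growth is simply assumed, and the transformation shown (replacing the raw factor value 
by the default rate observed at that value) is a simplified assumption for illustration, 
not the procedure of any particular vendor model. 

```python
import numpy as np

# Hypothetical data: the default rate is lowest at moderate growth (10%) and
# rises both for shrinking and for very fast-growing firms.
growth = np.linspace(-0.5, 0.7, 13)
default_rate = 0.02 + 0.10 * (growth - 0.1) ** 2

def linear_fit(x, y):
    """Least-squares fit y ~ a0 + a1 * x; returns (a0, a1)."""
    a1, a0 = np.polyfit(x, y, 1)
    return float(a0), float(a1)

def transform_factor(x, grid, grid_default_rate):
    """Map a raw factor value to the default rate observed at that value."""
    return np.interp(x, grid, grid_default_rate)

# In a linear model dZ/dX = a1 is constant, so the U-shape cannot be captured.
a0, a1 = linear_fit(growth, default_rate)
raw_corr = np.corrcoef(growth, default_rate)[0, 1]

# The transformed factor is monotone in risk by construction.
transformed = transform_factor(growth, growth, default_rate)
transformed_corr = np.corrcoef(transformed, default_rate)[0, 1]

print(f"slope on raw growth:      {a1:+.4f}")
print(f"correlation, raw factor:  {raw_corr:+.4f}")
print(f"correlation, transformed: {transformed_corr:+.4f}")
```

In this constructed example the raw growth variable carries almost no linear 
information about risk, whereas the transformed variable can enter a linear model with 
a single, economically plausible sign. 
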
The commonly applied statistical standards must be observed for all empirical 
models (statistical models, specific expert systems and applied economic models). 
Non-compliance with these standards is always an indication of design defects, 
which generally exhibit an adverse effect when applied. Without claiming com- 
pleteness, we consider the following aspects to be vital when developing a model: 


• Appropriateness of the random sample for the empirical model: The 
appropriateness of the random sample is the decisive factor for all empirical and 
statistical models. This is also relevant to applied economic models, as is illustrated by 
the KMV models. These models have been based on data on US firms, meaning 
that they draw on developments in the US markets only and solely reflect US 
accounting standards. Not all data which is important in this system is available 
when other accounting standards are used, with the result that when the models 
are transferred to other countries, one has to work with possibly questionable 
approximations. This has a bearing on certain characteristics of the models such 
as lack of ambiguity and the stability of the results. 


⁸ Moody’s RiskCalc (Falkenstein et al. 2000) provides one way of processing non-monotonous risk 
factors by appropriate transformation in linear models. Another one can be found in Chap. 2. 
• Over-parameterising the model: A frequently observed mistake is to include too 
many risk factors in the design of a rating system. One reason is an overly 
cautious approach when developing the system, i.e. each conceivable risk 
factor, or those which credit experts seem to consider obvious, are to be fed into 
the system. Another is that rating systems are often developed by committees, 
whose members would naturally like to see their particular “babies” (mostly a 
“favourite variable” or a special risk factor) reflected in the rating design. Neither 
approach is optimal from the statistical perspective, as there is an upper limit to the 
number of parameters to be calculated, depending on the size of the sample and the 
model used. If this rule is breached, errors are made which are called “overfitting”. 

• Statistical performance of the estimated model: The performance of the model in 
a statistical sense is generally stated in terms of type-1 and type-2 errors, 
measures of inequality such as Gini coefficients or entropy measures (Falkenstein 
et al. 2000), or other statistical measures which can be determined either for the 
sample or the entire population. These variables quantify the rating system’s 
ability to distinguish between good and bad borrowers and thus provide important 
information about the capability of the rating model with regard to discriminating 
between risks. These variables are especially important during the development of 
a rating system as they allow comparison of the performance of various models 
within the same data set. However, we think that these tools are only of minor 
importance for ongoing prudential monitoring. First, the concave form of 
the risk weighting function in the new Basel Accord already provides logical 
incentives, since systems which discriminate more finely are less burdened by 
regulatory capital than coarser systems. Second, the absolute size of the 
probability of default is the variable relevant for banking supervision as it is linked 
to the size of the regulatory capital. 

• Modelling errors, precision and stability: Certain modelling errors are inevitably 
part of every model because each model can depict only a part of economic reality in 
a simplified form. In order to be able to use a model correctly, one has to be aware of 
these limitations. However, in addition to these limitations, which are to a certain 
extent a “natural” feature of each model, the modelling errors caused by using an 
optimisation or estimation procedure also need to be considered. These estimation 
errors can be quantified for the model parameters from the confidence levels of the 
model parameters. Given certain distribution assumptions, or with the aid of cyclical 
or rotation methods, these confidence levels can be determined analytically from the 
same data which is used to estimate the parameters (Fahrmeir et al. 1996). If error 
calculation methods frequently used in the natural sciences are applied, it is possible 
to estimate the extent to which measurement bias of the individual model parameters 
affects the credit score Z. The stability of a model can be derived from the confidence 
levels of model parameters. Determining the stability of a model, i.e. its 
responsiveness to portfolio changes, seems to be particularly important. A more 
critical issue is model precision. In some methods, model parameters are determined 
(though illogically) with a precision that is several orders of magnitude higher than 
for the risk parameters. 
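
As an illustration of how parameter uncertainty carries over to the credit score, the 
following sketch applies first-order (Gaussian) error propagation to a linear score 
$Z = \alpha_0 + \alpha_1 X_1 + \cdots + \alpha_N X_N$. Reading the “error calculation 
methods frequently used in the natural sciences” as error propagation is our 
assumption, and the function name and all numbers are hypothetical. 

```python
import numpy as np

def score_std_error(x, cov_alpha):
    """Standard error of Z = a0 + a1*x1 + ... + aN*xN due to parameter uncertainty.

    x         : risk factor values (x1, ..., xN) of one borrower
    cov_alpha : (N+1) x (N+1) covariance matrix of the estimates (a0, ..., aN)
    """
    # Gradient of Z with respect to the parameters: dZ/da0 = 1, dZ/dak = xk.
    grad = np.concatenate(([1.0], np.asarray(x, dtype=float)))
    cov = np.asarray(cov_alpha, dtype=float)
    return float(np.sqrt(grad @ cov @ grad))

# Hypothetical example: two risk factors, uncorrelated parameter estimates.
cov_alpha = np.diag([0.04, 0.01, 0.0025])  # variances of a0, a1, a2
x = [2.0, -1.0]
print(score_std_error(x, cov_alpha))
```

Comparing this standard error with the score distance between neighbouring rating 
grades gives a rough indication of whether the precision of the parameters is in line 
with the precision claimed for the rating. 
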
12.2.1.4 Role of the Loan Officer - or Qualitative Assessment 


Loan officers play an important role both in setting up a rating system and in 
using it in practice. We think that qualitative assessments should be included in the 
final rating assignment, by allowing the loan officer to modify the suggested credit 
rating provided by the quantitative model.⁹ This is certainly necessary for 
exposures above a certain size; for retail loans, it may depend on the business 
activities and risk management structures in the bank. The sheer size of mass 
financing of consumer loans certainly results in less influence for the loan officer; 
rather, loan officers rely on correct procedures to check the automated rating proposal 
and the input provided by the sales officer. We discuss three important aspects accordingly: 


• The loan officer’s powers: Any manual modification of the automated rating 
proposal should be contained within a controlled and well-documented 
framework. The loan officer’s discretion should be set within clearly defined limits 
which specify at least the conditions permitting a deviation from the automated 
rating proposal and the information that the loan officer used additionally. One 
way to look at discretion is the use of a deviation matrix of final and suggested 
ratings, showing for each rating category how many suggested ratings 
(generated by the quantitative rating tool) are changed by manual override: more 
specifically, the share $M_{ij}$ of borrowers assigned by the quantitative system to the 
i-th category which loan officers finally place in category j. In a well-defined, 
practicable rating system, a high match between suggested ratings and final 
ratings should be expected in most cases, so in each line the values of $M_{ii}$ should 
be the largest and $M_{ij}$ should decrease the more the final ratings diverge from the 
suggestions. Clearly, greater deviations should lead to careful analysis of the 
shortcomings of the rating model, either indicating data issues or problems with 
the model itself. 

• Monitoring the ratings over time: Any rating system must ideally be monitored 
continuously and be able to process incoming information swiftly; however, 
ratings must be updated at least annually. This also applies to consumer 
loans. There, however, the focus is on ensuring that loans and borrowers are still 
assigned to the correct pool, i.e. still exhibiting the loss characteristics and the 
delinquency status of the previously assigned pool. As such, different 
methodologies may be used, for example by using an account-specific behavioural score. 
For wholesale loans, it may be helpful to analyse the frequency distribution of 
the time-span between two successive ratings of all borrowers in a specific 
portfolio. The expected pattern is shown in Fig. 12.3: most borrowers are re- 
evaluated at regular intervals, roughly once every 12 months, but in between, “ad 
hoc ratings” are based on information deemed to be important and their fre- 
quency increases with the amount of time that has elapsed since the first rating. 
Between the two regular re-ratings, a whole group of the same type of borrowers 


⁹ The normal transformation of qualitative information like family status, gender, etc. into 
numerical variables for the assessment of consumer loans would not replace such a qualitative oversight. 

Fig. 12.3 Frequency distribution of the time-span between two successive ratings for all 
borrowers in one portfolio 


(e.g. enterprises in one sector) may occasionally be re-rated because information 
relevant to the rating of this group has been received. It should be possible to 
explain any divergent behaviour which, in any case, provides insights into the 
quality of the rating process. 

• Feedback mechanisms of the rating process: A rating system must take account of 
the justified interests of both the user and the model developer. The user, i.e. the 
loan officer, is interested in a rating process which is lean, easy to use, 
comprehensible and efficient. The model developer, on the other hand, is interested in 
a rating model which is theoretically demanding and as comprehensive as possible. Where 
interests conflict, these will need to be reconciled. It is all the more important 
that a rating system is checked whilst in operational mode, to ascertain whether the 
model which the process is based on is appropriate and sufficiently understood by 
the users. In any case, procedures must be implemented which govern how a new 
version, or at least a new parameterisation, of the rating model is introduced. 
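
The deviation matrix of suggested and final ratings described under the loan 
officer’s powers above could be computed along the following lines. This is a minimal 
sketch: the function names, the three-grade scale and the sample data are hypothetical. 

```python
import numpy as np

def deviation_matrix(suggested, final, n_grades):
    """M[i, j]: share of borrowers with suggested grade i+1 finally placed in grade j+1."""
    counts = np.zeros((n_grades, n_grades))
    for s, f in zip(suggested, final):
        counts[s - 1, f - 1] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows without any suggested rating are left at zero.
    return np.divide(counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0)

def diagonal_is_largest(M):
    """True if, in every row, the share of unchanged ratings M[i, i] is the largest entry."""
    return bool(np.all(np.diag(M) >= M.max(axis=1)))

# Hypothetical ratings on a three-grade scale.
suggested = [1, 1, 1, 2, 2, 3]
final = [1, 1, 2, 2, 2, 3]
M = deviation_matrix(suggested, final, n_grades=3)
print(M)
print(diagonal_is_largest(M))
```

Rows in which the diagonal entry is not the largest would be natural candidates for 
the careful analysis of model shortcomings mentioned above. 
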


12.2.2 Result-Based Validation 


In 1996, the publication of capital requirements for market risk for a bank’s trading 
book positions as an amendment to the 1988 Basel Accord, was the first time that a 
bank’s internal methodology could be used for purposes of regulatory capital. The 
output of bank internal models, the so-called Value-at-Risk (VaR) which is the 
most popular risk measure in market risk, is translated into a minimum capital 
requirement, i.e. three times VaR. The supervisory challenge for most countries, 
certainly Germany, was to establish an appropriate supervisory strategy to finally 
permit these bank internal models for calculating regulatory capital. In addition to 
the supervisory assessment of the qualitative market risk environment in a bank, 
another crucial element of the strategy was the implementation of an efficient 
“top-down” monitoring approach for banks and banking supervisors. The relatively 
simple comparison between ex-ante estimation of VaR and ex-post realisation of 
the “clean” P&L¹⁰ of a trading book position, excluding extraneous factors such as 
interest payments, was the foundation for the quantitative appraisal. 
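
The comparison of ex-ante VaR and ex-post clean P&L amounts to counting the days on 
which the realised loss exceeded the forecast. A minimal sketch, with a hypothetical 
function name, hypothetical data and the sign convention stated in the comments: 

```python
def count_exceptions(var_forecasts, clean_pnl):
    """Number of days on which the clean loss exceeded the VaR forecast.

    var_forecasts : ex-ante VaR figures, given as positive loss amounts
    clean_pnl     : ex-post clean P&L for the same days (losses are negative)
    """
    if len(var_forecasts) != len(clean_pnl):
        raise ValueError("both series must have the same length")
    return sum(1 for var, pnl in zip(var_forecasts, clean_pnl) if -pnl > var)

# Hypothetical five-day example.
var_forecasts = [1.0, 1.0, 1.2, 1.2, 1.1]
clean_pnl = [0.3, -1.4, -0.2, -1.2, 0.5]
exceptions = count_exceptions(var_forecasts, clean_pnl)
print(exceptions)
```

The number of exceptions over a year can then be compared with the number expected 
at the chosen confidence level of the VaR. 
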

The concept for backtesting in the IRBA as introduced in paragraph 501 of the 
New Framework is relatively similar. In the IRB approach, analogously to market 
risk, the probability of default (PD) per rating category or, in special cases, the 
expected loss (EL) in the case of consumer loans, must be compared with the 
realised default rate or losses that have occurred. 

Despite the basic features common to market risk and credit risk, there are also 
important differences, most importantly the following two. First, the conceptual 
nature is different; in market risk the forecasted VaR is a percentile of the “clean” 
P&L distribution. This distribution can be generated from the directly observable 
profits and losses, and thus the VaR can be directly observed. By contrast, in credit 
risk only realised defaults (and losses) according to a specific definition can be 
observed directly, rather than the forecasted PD (and EL). 

A common and widespread approach for credit risk is to apply the law of large 
numbers and to infer the probability of default from the observed default 
rate.¹¹ To our knowledge, almost all backtesting techniques for PD (or EL) rely 
on this statistical concept. However, a proper application requires that borrowers 
are grouped into grades exhibiting similar default risk characteristics.¹² This is 
necessary even in the case of direct estimates of PD, when each borrower is 
assigned an individual PD. 
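
A simple grade-level comparison of estimated PD and observed default rate might be 
sketched as follows. The sketch uses a one-sided binomial test and therefore assumes 
independent defaults with an identical PD within each grade, which is exactly the 
simplification discussed in the footnotes; the grades, figures and the 5% threshold 
are hypothetical. 

```python
from math import exp, lgamma, log

def binomial_tail(n, d, pd):
    """P(D >= d) for D ~ Binomial(n, pd); assumes independent defaults and 0 < pd < 1."""
    def log_pmf(k):
        return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(pd) + (n - k) * log(1.0 - pd))
    return sum(exp(log_pmf(k)) for k in range(d, n + 1))

# Hypothetical grades: (number of borrowers, estimated PD, observed defaults).
grades = {"A": (800, 0.005, 5), "B": (600, 0.02, 14), "C": (300, 0.08, 37)}

for name, (n, pd, defaults) in grades.items():
    # Probability of observing at least this many defaults if the PD is correct.
    p_value = binomial_tail(n, defaults, pd)
    flag = "review" if p_value < 0.05 else "ok"
    print(f"grade {name}: default rate {defaults / n:.3%}, PD {pd:.3%}, "
          f"p-value {p_value:.3f} ({flag})")
```

Because defaults are in practice correlated, a test of this kind tends to reject too 
often; it should be read as a first screening rather than as a verdict on the calibration. 
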

The second main difference relates to the available data history on which the 
comparison is based. In market risk, the frequency is at least 250 times a year in the 
case of daily data. By contrast, in credit risk there is only one data point per annum 
to be assumed. To make it more complex, there is an additional problem arising 
from measuring credit default, which is the key variable for the quantification and 
therefore the validation. The definition of credit default is largely subjective. The 
New Framework suggests retaining this subjective element as the basis of the IRB 


¹⁰ There are different interpretations among different supervisors on this issue. 


¹¹ Besides the fact that an application of the law of large numbers would require that defaults are 
uncorrelated, there is another subtle violation of the prerequisites for applying the law of large 
numbers. It is required that the defaults stem from the same distribution. This requirement cannot 
be seen to be fulfilled for different borrowers. To give a picture: the difference is like that 
between approximating the probability of throwing a six either by throwing the same die 1,000 
times and calculating the ratio of sixes to the total number of throws, or by throwing 1,000 dice 
once and calculating the ratio of sixes to the number of dice thrown. 

¹² We believe that validation of rating systems, i.e. the calibration of PDs, is almost impossible 
without the grouping of borrowers into grades with the same risk profile, which is also one of the 
key requirements of Basel II. 
approach, albeit with a forward-looking focus and a back-stop ratio of 90 days past 
due. This may be justified, not least by the fact that a significant number of defaulted 
borrowers seem to have a considerable influence on the timing of the credit default. 

Correspondingly, the criteria and — more importantly — the applied methodology 
are also different. In market risk, the challenge is to provide a clean P&L and to 
store the corresponding data. This differs significantly from the necessary 
compilation of the rating history and credit defaults over time. Depending on the 
required reproducibility of the results, considerable time and effort may be needed, 
and it is difficult to estimate which requirement is most important for which area, 
thus entailing higher costs for the bank. Owing to the volume of available data points 
in market risk, the simplicity and multiplicity of the applicable methods are 
impressive. This naturally poses an apparently insuperable challenge for credit risk. 

A further problem is the impact of a rating philosophy on backtesting. The rating 
philosophy is what is commonly referred to as either Point-in-Time (PIT) or 
Through-the-Cycle (TTC) ratings. PIT ratings measure credit risk given the current 
state of a borrower in its current economic environment, whereas TTC ratings 
measure credit risk taking into account the (assumed) state of the borrower over a 
“full” economic cycle. Assuming an underlying two-step rating process as described 
above, this means that a TTC rating system requires (1) constant PDs per 
grade over time and (2) a structure that lets borrowers migrate on idiosyncratic 
effects only, i.e. no migration of borrowers between grades related to cycle effects. 
Consequently, a TTC rating system must ensure that there is virtually no 
correlation between “grade migration” and the cycle. Similarly, PD estimates for each 
rating grade must not change in a way which is correlated with the cycle. 

An alternative might be to make some adjustments to the basic outputs in order to 
achieve an acyclical effect. An example of this is given by the UK-FSA’s scaling 
approach.¹³ 

PIT and TTC mark the ends of the spectrum of possible rating systems. In 
practice, neither pure TTC nor pure PIT systems will be found, but hybrid systems 
which lean towards either PIT or TTC. Agency ratings are assumed to be TTC, 
whereas current bank internal systems (at least in most cases in Germany and 
many other countries) are regarded as PIT. This is plausible because, for the 
purpose of managing credit risk, a PIT system that detects borrowers’ deterioration 
early seems to be more reasonable than a TTC system. 

The increased focus of supervisors on reducing excess cyclicality in minimum capital 
requirements may lead banks to favour TTC ratings over PIT ratings. A bank may then 
decide to explicitly use two different PD 
calibrations, one for internal purposes (PIT, for example for pricing, margining and 
remuneration) and one for regulatory purposes (TTC, for example for regulatory 
capital). In this case a very important question to be asked is whether this may be 
appropriate in the light of the requirement of the use-test. To this end, as long as the 
internal processes, i.e. the credit granting process as well as the rating assignment 


¹³ Cf. Financial Services Agency (2006, 2009). 
process, stay the same for both calibrations we would suggest that the use-test criteria 
may be acceptable. In addition, importance should also be given to the fact that 
broader system-wide economic changes on the PD estimate should not be reflected 
in the TTC estimates in order to reduce its idiosyncratic risk induced volatility. 

The rating philosophy has an important impact on backtesting. In theory, in a TTC 
system a borrower’s rating, i.e. its rating grade, is stable over time, reflecting the 
long-term full-cycle assessment. However, the observed default rates for the 
individual grades are expected to vary over time in accordance with the change in the 
economic environment. The contrary is the case for PIT systems. By reacting more 
quickly to changing economic conditions, borrower ratings tend to migrate through 
the rating grades over the cycle, whereas the PD for each grade is expected to be 
more stable over time, i.e. the PD is more independent of the current economic 
environment. The Basel Committee did not favour a particular rating philosophy. Both 
PIT systems and TTC systems are fit for the IRBA. However, it seems to be 
reasonable to look at risk parameters as forecasts of their realisations which can be 
observed within a 1-year time horizon. This reasoning is reflected in the first 
validation principle of the AIGV, where a forward-looking element is required to 
be included in the estimation of Basel’s risk parameters. 
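
The contrast between the two philosophies can be made concrete with a stylised 
sketch. Everything in it is an assumption made for illustration: four borrowers with 
fixed long-run PDs, a cycle that simply scales all PDs, and grades defined by PD 
boundaries (grade 0 being the best). 

```python
import numpy as np

cycle = np.array([0.5, 1.0, 2.0, 1.0])             # PD multiplier per year
long_run_pd = np.array([0.005, 0.01, 0.02, 0.04])  # four hypothetical borrowers
bounds = np.array([0.0075, 0.015, 0.03])           # PD boundaries between grades 0..3

current_pd = np.outer(cycle, long_run_pd)          # rows: years, columns: borrowers

# PIT: the grade follows the current PD, so borrowers migrate with the cycle.
pit_grade = np.searchsorted(bounds, current_pd)
# TTC: the grade follows the long-run PD, so grades stay fixed over the cycle.
ttc_grade = np.tile(np.searchsorted(bounds, long_run_pd), (len(cycle), 1))

# Average current PD of the borrowers sitting in grade 1, year by year.
pit_grade1_pd = [current_pd[t][pit_grade[t] == 1].mean() for t in range(len(cycle))]
ttc_grade1_pd = [current_pd[t][ttc_grade[t] == 1].mean() for t in range(len(cycle))]

print("PIT grades per year:\n", pit_grade)
print("PIT grade 1 PD per year:", pit_grade1_pd)
print("TTC grade 1 PD per year:", ttc_grade1_pd)
```

In this stylised setting the PIT grades move with the cycle while the risk within a 
grade stays constant, whereas the TTC grades are fixed and the risk within a grade 
fluctuates, which is why a year-by-year comparison with one fixed PD per grade is 
problematic for TTC systems. 
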

However, validation of TTC-ratings is extremely challenging if it is looked at 
from the perspective of backtesting since for TTC ratings the target for PD 
calibration reflects an average for the cycle. If statistical testing techniques are to 
be applied, then the requirement for the length of a time series will be increased by 
the factor of a cycle length in years. Additionally, backtesting requires the integra- 
tion of full cycles only. Therefore the accuracy of the risk quantified in TTC ratings 
is difficult to evaluate and straightforward backtesting techniques, as sketched out 
in many articles of this book, are expected to be of limited value. 

In the special case of consumer loans, the estimation and validation of key 
parameters is extremely dependent on the approach taken by a bank. A rating system 
similar to that used for wholesale borrowers leads to an analogous assessment for 
purposes of validation. In contrast, instead of rating each borrower separately, the 
BCBS clusters loans in homogeneous portfolios during the segmentation process 
(see above). This segmentation process should include assessing borrower and 
transaction risk characteristics like product type etc., as well as identifying the 
different delinquency stages (30 days, 60 days, 90 days etc.). Subsequently, the risk 
assessment on a (sub-) portfolio level could be based on its roll rates, i.e. the rates 
at which transactions move from one delinquency stage to another. 
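
Roll rates of this kind can be estimated by counting observed transitions between 
delinquency stages over a period. The following is a minimal sketch; the stage labels, 
the function name and the sample data are hypothetical. 

```python
from collections import Counter

def roll_rates(transitions):
    """Share of accounts moving from each start stage to each end stage.

    transitions : iterable of (stage_at_start, stage_at_end) pairs, one per account
    """
    transitions = list(transitions)
    pair_counts = Counter(transitions)
    start_counts = Counter(start for start, _ in transitions)
    return {pair: n / start_counts[pair[0]] for pair, n in pair_counts.items()}

# Hypothetical one-period observations for a small sub-portfolio.
observed = ([("current", "current")] * 8 + [("current", "30")] * 2
            + [("30", "current"), ("30", "30"), ("30", "60"), ("30", "60")])
rates = roll_rates(observed)
for (start, end), rate in sorted(rates.items()):
    print(f"{start:>7} -> {end:<7} {rate:.0%}")
```
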

The implications of these rather general considerations and possible solutions for 
the problems raised here are discussed in detail in Chap. 9. 


12.2.3 Process-Based Validation 


Validating rating processes includes analysing the extent to which an internal rating 
system is used in daily banking business. The use test for ratings and the associated 
risk estimates is one of the key requirements in the BCBS’ final framework. There are 
two different levels of validation: firstly, the plausibility of the actual rating in 
itself, and secondly, the integration of ratings output in the operational procedure 
and interaction with other processes: 


• Understanding the rating system: It is fundamental to both types of analysis that 
employees understand whichever rating methodology is used. The learning 
process should not be restricted to loan officers. As mentioned above, it should 
also include those employees who are involved in the rating process. In-house 
training courses and other training measures are required to ensure that the 
process operates properly. 

• Importance for management: Adequate corporate governance is crucial for 
banks. In the case of a rating system, this requires executive management, and to a 
certain extent the supervisory board, to take responsibility for authorising the rating 
methods and their implementation in the bank’s day-to-day business. We would 
expect different rating methods to be used depending on the size of the borrower,¹⁴ 
taking account of the borrowers’ different risk content and the relevance of the 
incoming information following the decision by senior management. 

• Internal monitoring processes: The monitoring process must cover at least the 
extent and the type of rating system used. In particular, it should be possible to 
rate all borrowers in the system, with the final rating allocated before credit is 
granted. If the rating is given after credit has been granted, this raises doubts 
about the usefulness of internal rating. The same applies to a rating which is not 
subject to a regular check. There should be a check at least annually and whenever 
new information about the debtor is received which casts doubt on their ability to 
clear their debts. The stability of the rating method over time, balanced with the 
need to update the method as appropriate, is a key part of the validation. To do 
this, it is necessary to show that objective criteria are incorporated so as to lay 
down the conditions for a re-estimation of the quantitative rating model or to 
determine whether a new rating model should be established. 

• Integration in the bank’s financial management structure: Unless credit risk is 
recorded in a rational manner for each borrower, it is impossible to perform a proper 
margin calculation taking into account standard risk costs. If this is to be part of bank 
management by its decision-making and supervisory bodies, a relationship must 
be determined between the individual rating categories and the standard risk 
costs. However, it must be borne in mind that the probability of default is simply 
one component of the calculation of the standard risk costs and that, similarly to the 
credit risk models, other risk parameters, such as the rate of debt collection and 
the size of the exposure in the event of a credit default, the maturity of the loan, 
transfer risk and concentration risks, should also be recorded. Ultimately the 
gross margin, which approximates to the difference between lending rates and 

¹⁴ In the Basel Committee’s new proposals in respect of the IRB approach, small enterprises may, 
for regulatory purposes, be treated as retail customers and, unlike large corporate customers, small 
and medium-sized enterprises are given a reduced risk weighting in line with their turnover. 
refinancing costs, can act as a yardstick for including the standard risk costs. In 
order to determine the concentration risks at portfolio level more appropriately, 
it seems essential to use credit risk models and thus to be in a position to allocate 
venture capital costs appropriately. Therefore, if net commission income is 
added to the gross margin, the operational costs are netted out, and the venture 
capital costs are also taken into account, it is possible to calculate the result of 
lending business. It is naturally advisable to include, as part of the management of 
the bank, all other conventional instruments of credit risk measurement, such as 
employee bonus systems and portfolio optimisation. 
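
The margin calculation described in the last item can be written down as simple 
arithmetic. The sketch below is illustrative only: approximating standard risk costs 
by the expected loss PD x LGD x EAD, deducting them from the result, and all names 
and figures are our assumptions rather than a prescribed method. 

```python
def standard_risk_costs(pd, lgd, ead):
    """Expected loss, used here as a simple proxy for standard risk costs (assumption)."""
    return pd * lgd * ead

def lending_result(gross_margin, net_commission_income, operating_costs,
                   risk_costs, capital_costs):
    """Result of lending business after operating, risk and capital costs."""
    return (gross_margin + net_commission_income
            - operating_costs - risk_costs - capital_costs)

# Hypothetical loan: PD 2%, loss given default 45%, exposure at default 1,000,000.
risk_costs = standard_risk_costs(pd=0.02, lgd=0.45, ead=1_000_000)
result = lending_result(gross_margin=25_000, net_commission_income=3_000,
                        operating_costs=8_000, risk_costs=risk_costs,
                        capital_costs=4_000)
print(round(risk_costs, 2), round(result, 2))
```
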


In principle, the Basel Committee requires these mainly portfolio-based methods 
in the second pillar of the new Accord as part of the self-assessment of capital 
adequacy required of the banks in the Capital Adequacy Assessment Process 
(CAAP). This frequently leads to problems when integrating banks’ own rating 
systems into credit risk models purchased from specialist providers. In our view, this 
may ultimately increase the complexity for banks and banking supervisors and at the 
same time entail considerable competitive distortions if the rating is less objective. 


12.3 Concluding Remarks 


To set up and to validate bank-internal rating systems is a challenging task and 
requires a considerable degree of sensitivity (Neale 2001). Our analysis started with 
the comparatively more difficult data situation and the availability of public and 
private information in order to quantify credit risk of banks’ borrowers in a 
structured way including its subsequent validation. The advantage of the structured 
credit risk assessment, when applying an automated rating process, is its objecti- 
vity. This is true for the rating method and for the selection of the risk factors in the 
rating model, including their effectiveness in generating a rating proposal. The final 
integration of the qualitative credit assessment, based on a subjective decision by 
the loan officer, is more difficult in the structured assessment. 

The final rating outcome comprises an array of individual observations, which 
may provide very different results. Ultimately, our suggested approach to validation 
takes this complexity into account by highlighting the importance of the rating 
process. This interdependence is reflected in the ongoing cycle of setting up and 
monitoring the rating system. Individual observations during the monitoring pro- 
cess are frequently integrated quickly into a revision of the methodological process. 

The validation method is analogous to a jigsaw puzzle. Only if the many 
individual pieces are assembled properly will the desired result be achieved. 
The individual pieces of the puzzle seem unimpressive and often unattractive at first, 
but they eventually contribute to the ultimate picture. This may, for example, be an 
appropriate description when setting up the system and conducting ongoing checks 
on the quality of the data management or the ongoing adjustment of banks’ internal 
credit standards. Each piece of the puzzle is crucial, to both component-based and 
process-based validation. One crucially important piece is the process-based 
component. All conventional methods of quantitative validation should encompass 
the assessment of the rating tool’s economic meaningfulness as well as its compli- 
ance with statistical standards. 

Transparency and comprehensibility of the chosen methods at each stage of 
development, as well as its plausibility, are fundamental requirements of a sound 
rating system. The advantage of using empirical statistical approaches is that these 
models are comprehensible and that defects or statistical shortcomings can be 
detected by simple statistical tests. By contrast, rule-based systems and applied 
economic models are more heavily model-dependent and therefore point to model 
risk. In the case of benchmarking methods, however, the choice of the peer group 
with known risk content is decisive, although the instability of such models, in 
particular, can be a problem. Despite the differences, most applied methods can 
fulfil all requirements initially, albeit to a differing degree. 

The broad use and the interplay of different quantitative plausibility and valida- 
tion methods is the basis of a quantitative analysis of the methods used. Backtesting 
using a simple retrospective comparison of estimated default probabilities with 
actual default rates is crucial, and therefore a decisive element in the validation of 
the results.¹⁵ Complementary methods are also needed, particularly in the 
development stage of rating models, in order to ensure the plausibility of the selected 
methods. These include techniques which underscore the stability and accuracy of 
the methods, although caution is required with regard to quantification and espe- 
cially with regard to methods used to measure accuracy. 

The validation of internal rating systems underscores the importance of using a 
formalised process when devising them and in their daily application. This covers 
both the formalised keying in of data and the criteria for subjectively “overruling” 
the rating proposal. Unless internal ratings are used on a regular basis and in a 
structured manner over time, banks and banking supervisors by referring to the 
“use-test” will find difficulties in accepting such a rating system. 
Chapter 13 
Measures of a Rating’s Discriminative Power: 
Applications and Limitations 


Bernd Engelmann 


13.1 Introduction 


A key attribute of a rating system is its discriminative power, i.e., its ability to 
separate good credit quality from bad credit quality. Similar problems arise in other 
scientific disciplines. In medicine, the quality of a diagnostic test is mainly deter- 
mined by its ability to distinguish between ill and healthy persons. Analogous 
applications exist in biology, information technology, and engineering sciences. 
The development of measures of discriminative power dates back to the early 
1950s. An interesting overview is given in Swets (1988). 

Many of the concepts developed in other scientific disciplines in different 
contexts can be transferred to the problem of measuring the discriminative power 
of a rating system. Most of the concepts presented in this article were developed in 
medical statistics. We will show how to apply them in a ratings context. 

Throughout this chapter, we will demonstrate the application of all concepts on 
two prototype rating systems which are developed from the same data base. We 
consider only rating systems which distribute debtors in separate rating categories, 
i.e., the rating system assigns one out of a finite number of rating scores to a debtor. 
For both rating systems, we assume that the total portfolio consists of 1,000 debtors, 
where 50 debtors defaulted and 950 debtors survived. Both rating systems assign 
five rating scores 1, 2, 3, 4, and 5 to debtors where 1 stands for the worst credit 
quality and 5 for the best. Table 13.1 summarizes the rating scores that were 
assigned to the surviving debtors by both rating systems. 

Table 13.1 gives the joint distribution of the non-defaulting debtors over the
two rating systems. For example, we can read from Table 13.1 that there are 40
non-defaulting debtors who were assigned to rating category 4 by Rating 1 and
to rating category 5 by Rating 2. The other numbers are interpreted analogously.
The distribution of the defaulting debtors in the two rating systems is given in
Table 13.2. Both tables provide all information needed to apply the concepts that
will be introduced in the subsequent sections of this chapter.


Table 13.1 Distribution of the non-defaulting debtors in Rating 1 and Rating 2

                        Rating 1
                   1     2     3     4     5   Total
Rating 2    1     90    60    15    10     5     180
            2     45    90    30    20    15     200
            3     10    35   100    45    20     210
            4      5    10    30   100    70     215
            5      0     5    10    40    90     145
        Total    150   200   185   215   200


Table 13.2 Distribution of the defaulting debtors in Rating 1 and Rating 2

                        Rating 1
                   1     2     3     4     5   Total
Rating 2    1     20     5     0     3     0      28
            2      4     7     0     0     0      11
            3      3     0     2     0     0       5
            4      0     0     0     2     2       4
            5      0     2     0     0     0       2
        Total     27    14     2     5     2

We introduce the notation that we will use throughout this chapter. We assume
a rating system which consists of discrete rating categories. The rating categories¹
are denoted with $R_1, \ldots, R_k$ where we assume that the rating categories are sorted in
increasing credit quality, i.e., the debtors with the worst credit quality are assigned to
$R_1$ while the debtors with the best credit quality are assigned to $R_k$. In our example
in Tables 13.1 and 13.2 we have $k = 5$ and $R_1 = 1, \ldots, R_5 = 5$. We denote the set
of defaulting debtors with $D$, the set of non-defaulting debtors with $ND$, and the set
of all debtors with $T$. The number of debtors in the rating category $R_i$ is denoted
with $N(i)$ where the subscript refers to the group of debtors. If we discuss a specific
rating we make this clear by an additional argument, e.g., for Rating 1 the number
of defaulters in rating category 4 is $N_D(4;1) = 5$, or the total number of debtors in
rating category 2 is $N_T(2;1) = 214$. Since the event “Default” or “Non-default” of a
debtor is random, we have to introduce some random variables. With $S$ we denote
the random distribution of rating scores while the subscript will indicate the group of
debtors the distribution function corresponds to, e.g., $S_D$ denotes the distribution of
the rating scores of the defaulting debtors. The empirical distribution of the rating
scores, i.e., the distribution of the rating scores that is realised by the observed
defaults and non-defaults, is denoted by $\hat{S}$, where the subscript again refers to the
group of debtors. For example, for Rating 1

$$\begin{aligned}
\hat{S}_D(3;1) &= 2/50 = 0.04, \\
\hat{S}_{ND}(2;1) &= 200/950 = 0.21, \\
\hat{S}_T(5;1) &= 202/1000 = 0.20.
\end{aligned}$$


¹ The terminology rating category or rating score is used interchangeably throughout this chapter.


The cumulative distribution of $S$ is denoted with $C$, i.e., $C(R_i)$ is the probability
that a debtor has a rating score lower than or equal to $R_i$. The specific group of
debtors the distribution function is referring to is given by the corresponding
subscript. The empirical cumulative distribution function is denoted by $\hat{C}$, e.g.,
the empirical probability that a non-defaulting debtor’s rating score under Rating
2 is lower than or equal to “4” is given by


$$\hat{C}_{ND}(4;2) = (180 + 200 + 210 + 215)/950 = 0.847.$$


Finally, we define the common score distribution of two rating systems Rating 1
and Rating 2 by $S^{12}$. The expression $S^{12}(R_i, R_j)$ gives the probability that a debtor has
rating score $R_i$ under Rating 1 and a rating score $R_j$ under Rating 2. Again the index
$D$, $ND$, $T$ refers to the set of debtors to which the score distribution corresponds. The
cumulative distribution is denoted with $C^{12}$, i.e., $C^{12}(R_i, R_j)$ gives the probability
that a debtor has a rating score less than or equal to $R_i$ under Rating 1 and less than
or equal to $R_j$ under Rating 2. Again, examples are given for the corresponding
empirical distributions using the data of Tables 13.1 and 13.2:


$$\begin{aligned}
\hat{S}^{12}_D(2,2) &= 7/50 = 0.14, \\
\hat{S}^{12}_{ND}(2,4) &= 10/950 = 0.0105, \\
\hat{C}^{12}_D(2,3) &= (20 + 5 + 4 + 7 + 3 + 0)/50 = 0.78.
\end{aligned}$$
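
The notation can be made concrete with a few lines of code. The following is a minimal sketch in Python, assuming the two tables are stored as nested lists (rows: Rating 2, columns: Rating 1); the names `marginal` and `joint_cumulative` are illustrative choices and not part of the chapter.

```python
# Tables 13.1 and 13.2: rows are Rating 2 categories 1..5,
# columns are Rating 1 categories 1..5.
NON_DEFAULTERS = [
    [90, 60, 15, 10, 5],
    [45, 90, 30, 20, 15],
    [10, 35, 100, 45, 20],
    [5, 10, 30, 100, 70],
    [0, 5, 10, 40, 90],
]
DEFAULTERS = [
    [20, 5, 0, 3, 0],
    [4, 7, 0, 0, 0],
    [3, 0, 2, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 2, 0, 0, 0],
]


def marginal(table, rating):
    """Counts per category for Rating 1 (column sums) or Rating 2 (row sums)."""
    if rating == 1:
        return [sum(row[j] for row in table) for j in range(5)]
    return [sum(row) for row in table]


def joint_cumulative(table, i, j):
    """Empirical C^12(i, j): share with Rating 1 score <= i and Rating 2 score <= j."""
    total = sum(sum(row) for row in table)
    return sum(table[r][c] for r in range(j) for c in range(i)) / total


n_d_rating1 = marginal(DEFAULTERS, 1)        # N_D(i;1)
n_nd_rating1 = marginal(NON_DEFAULTERS, 1)   # N_ND(i;1)
n_t_rating1 = [d + nd for d, nd in zip(n_d_rating1, n_nd_rating1)]  # N_T(i;1)

s_d_3 = n_d_rating1[2] / sum(n_d_rating1)                     # S_D(3;1)
c_nd_4_rating2 = sum(marginal(NON_DEFAULTERS, 2)[:4]) / 950   # C_ND(4;2)
c12_d = joint_cumulative(DEFAULTERS, 2, 3)                    # C_D^12(2,3)

print(n_t_rating1, s_d_3, round(c_nd_4_rating2, 3), c12_d)
```

The marginal counts of Rating 1 and Rating 2 obtained this way are the only inputs needed for the CAP and ROC calculations in the following sections.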


Having defined the notation, we give a short outline of this chapter. In Sect. 13.2 
we will define the measures, Cumulative Accuracy Profile (CAP) and Receiver 
Operating Characteristic (ROC), which are the most popular in practice and show 
how they are interrelated. In Sect. 13.3 we will focus on the statistical properties of 
the summary measures of the CAP and the ROC. The final section discusses the 
applicability and the correct interpretation of these measures. 


13.2 Measures of a Rating System’s Discriminative Power 


We will define the measures of discriminative power that are of interest to us in this 
section. We will focus on the Cumulative Accuracy Profile (CAP) and the Receiver 
Operating Characteristic (ROC). These are not the only measures described in the 
literature but the most important and the most widely applied in practice. Examples 
of measures that are not treated in this article are entropy measures. We refer the
reader to Sobehart et al. (2000) for an introduction to these measures. Besides the
basic definitions of the CAP and the ROC and their summary measures, we will 
show how both concepts are connected and explore some extensions in this section. 


13.2.1 Cumulative Accuracy Profile 


The definition of the Cumulative Accuracy Profile (CAP) can be found in Sobehart
et al. (2000). It plots the empirical cumulative distribution of the defaulting debtors
$\hat{C}_D$ against the empirical cumulative distribution of all debtors $\hat{C}_T$. This is
illustrated in Fig. 13.1. For a given rating category $R_i$, the percentage of all debtors with
a rating of $R_i$ or worse is determined, i.e., $\hat{C}_T(R_i)$. Next, the percentage of defaulted
debtors with a rating score worse than or equal to $R_i$, i.e., $\hat{C}_D(R_i)$, is computed. This
determines the point A in Fig. 13.1. Completing this exercise for all rating categories
of a rating system determines the CAP curve. Therefore, every CAP curve
must start in the point (0, 0) and end in the point (1, 1).

There are two special situations which serve as limiting cases. The first is a rating 
system which does not contain any discriminative power. In this case, the CAP 
curve is a straight line which halves the quadrant because if the rating system 
contains no information about a debtor’s credit quality it will assign x% of the 
defaulters among the x% of the debtors with the worst rating scores (“Random 
Model” in Fig. 13.1). The other extreme is a rating system which contains perfect 
information concerning the credit quality of the debtors. In this case, all defaulting 
debtors will get a worse rating than the surviving debtors and the resulting CAP 
curve rises straight to one and stays there (“Perfect Forecaster” in Fig. 13.1).


Fig. 13.1 Illustration of cumulative accuracy profiles (axis label: $\hat{C}_D$; curves: Perfect Forecaster, Rating Model, Random Model)

The information contained in a CAP curve can be summarised into a single
number, the Accuracy Ratio (AR) (this number is also known as Gini Coefficient or 
Power Statistics). It is given by 


$$AR = \frac{a_R}{a_P}, \tag{13.1}$$


where $a_R$ is the area between the CAP curve of the rating model and the CAP curve of
the random model (grey/black area in Fig. 13.1) and $a_P$ is the area between the CAP
curve of the perfect forecaster and the CAP curve of the random model (grey area in
Fig. 13.1). The ratio AR can take values between zero and one.² The closer AR is to
one, i.e., the more the CAP curve is to the upper left, the higher is the discriminative
power of a rating model.

We finish this subsection by calculating the CAP curves of Rating 1 and Rating 2. 
Since both rating systems have five rating categories, we can compute four points 
of the CAP curve in addition to the points (0,0) and (1,1). To get a real curve, the 
six points of each CAP curve have to be connected by straight lines. We illustrate 
this procedure for Rating 1. Starting at the left, we have to compute $\hat{C}_T(1;1)$ and
$\hat{C}_D(1;1)$, which we get from Tables 13.1 and 13.2 as


$$\begin{aligned}
\hat{C}_T(1;1) &= 177/1000 = 0.177, \\
\hat{C}_D(1;1) &= 27/50 = 0.540.
\end{aligned}$$


In the next step, we compute $\hat{C}_T(2;1)$ and $\hat{C}_D(2;1)$, which results in


$$\begin{aligned}
\hat{C}_T(2;1) &= (177 + 214)/1000 = 0.391, \\
\hat{C}_D(2;1) &= (27 + 14)/50 = 0.820.
\end{aligned}$$


The remaining points are computed analogously. The procedure for Rating 
2 is similar. The resulting CAP curves are illustrated in Fig. 13.2. We see that the 
CAP curve of Rating 1 is always higher than the CAP curve of Rating 2, i.e., the
discriminative power of Rating 1 is higher. This is also reflected in the AR values of 
both rating models. For Rating 1, we find an AR of 0.523 while for Rating 2, the AR 
is calculated as 0.471. 
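
The CAP points and the Accuracy Ratio of both ratings can be reproduced from the marginal counts of Tables 13.1 and 13.2. The sketch below assumes piecewise linear CAP curves, as described above, and integrates them with the trapezoidal rule; the function names are illustrative and not taken from the chapter.

```python
# Per-category counts, worst category (1) first, read off Tables 13.1 and 13.2.
RATING_1_DEFAULTS = [27, 14, 2, 5, 2]
RATING_1_NON_DEFAULTS = [150, 200, 185, 215, 200]
RATING_2_DEFAULTS = [28, 11, 5, 4, 2]
RATING_2_NON_DEFAULTS = [180, 200, 210, 215, 145]


def cap_points(defaults, non_defaults):
    """Points (C_T, C_D) of the CAP curve, including (0, 0) and (1, 1)."""
    total_d = sum(defaults)
    total_all = total_d + sum(non_defaults)
    points, cum_d, cum_all = [(0.0, 0.0)], 0, 0
    for d, nd in zip(defaults, non_defaults):
        cum_d += d
        cum_all += d + nd
        points.append((cum_all / total_all, cum_d / total_d))
    return points


def accuracy_ratio(defaults, non_defaults):
    """AR = a_R / a_P for a piecewise linear CAP curve."""
    points = cap_points(defaults, non_defaults)
    # Trapezoidal area under the CAP curve of the rating model.
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
    default_rate = sum(defaults) / (sum(defaults) + sum(non_defaults))
    a_r = area - 0.5                       # rating model vs. random model
    a_p = (1 - default_rate / 2) - 0.5     # perfect forecaster vs. random model
    return a_r / a_p


ar_1 = accuracy_ratio(RATING_1_DEFAULTS, RATING_1_NON_DEFAULTS)
ar_2 = accuracy_ratio(RATING_2_DEFAULTS, RATING_2_NON_DEFAULTS)
print(cap_points(RATING_1_DEFAULTS, RATING_1_NON_DEFAULTS))
print(round(ar_1, 3), round(ar_2, 3))
```

The area under the perfect forecaster's CAP curve is taken as one minus half the default rate, because that curve rises linearly to one over the first 5% of debtors and is flat afterwards.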


13.2.2 Receiver Operating Characteristic 


The concept of the Receiver Operating Characteristic (ROC) was developed in
signal detection theory, hence the name. It was introduced to rating systems in
Sobehart and Keenan (2001). The concept is illustrated in Fig. 13.3. This figure
shows the distributions of the rating scores for defaulting and non-defaulting debtors.
It can be seen that the rating system has discriminative power since the rating
scores are higher for surviving debtors. A cut-off value V provides a simple decision
rule to classify debtors into potential defaulters and non-defaulters. All debtors with
a rating score lower than V are considered as defaulters while all debtors with a
rating score higher than V are treated as non-defaulters. Under this decision rule
four scenarios can occur which are summarised in Table 13.3.


² In principle, AR could be negative. This would be the case when the ranking of the debtors by the
rating system is wrong, i.e., the good debtors are assigned to the rating categories of the poor
debtors.

Fig. 13.2 CAP curves for Rating 1 and Rating 2

Fig. 13.3 Rating score distributions for defaulting and non-defaulting debtors

If a debtor with a rating score below V defaults, the rating system’s prediction
was correct. We call the fraction of correctly forecasted defaulters the “hit rate”. 
The same is true for non-defaulters with a rating score above V. In this case, a non- 
defaulter was predicted correctly. If a non-defaulter has a rating score below V, the 
decision was wrong. The rating system raised a false alarm. The fourth and final 
case is a defaulter with a rating score above V. In this case the rating system missed 
a defaulter and made a wrong prediction. 

For a given cut-off value V, a rating system should have a high hit rate and a low 
false alarm rate. The Receiver Operating Characteristic curve is given by all pairs 
(false alarm rate, hit rate), which are computed for every reasonable cut-off value. It 
is clear that the ROC curve starts in the point (0, 0) and ends in the point (1, 1). If the 
cut-off value lies below all feasible rating scores, both the hit rate and the false alarm
rate are zero. Similarly, if the cut-off value is above all feasible rating scores, the hit
rate and the false alarm rate are equal to one. The concept of the ROC curve is 
illustrated in Fig. 13.4 below. 

In our setting, the cut-off points V are defined by the rating categories. Therefore,
we get in total $k - 1$ cut-off points. Consider the point B in Fig. 13.4. To compute this
point we define the decision rule: A debtor is classified as a defaulter if he has a
rating of $R_i$ or worse, otherwise he is classified as a non-defaulter. Under this
decision rule, the hit rate is given by $\hat{C}_D(R_i)$, which is the fraction of all defaulters
with a rating of $R_i$ or worse. Similarly, the false alarm rate is given by $\hat{C}_{ND}(R_i)$,
which is the fraction of all non-defaulters with a rating of $R_i$ or worse. The ROC
curve is obtained by computing these numbers for all rating categories.


Table 13.3 Outcomes of the simple classification rule using the cut-off value V

Rating score            Default                     Non-default
Below cut-off value     Correct prediction (hit)    Wrong prediction (false alarm)
Above cut-off value     Wrong prediction (error)    Correct prediction (correct rejection)


Fig. 13.4 Illustration of receiver operating characteristic curves (axis label: $\hat{C}_{ND}$, with $\hat{C}_{ND}(R_i)$ marked; curves: Rating Model, Perfect Forecaster, Random Model)

Again, we have the two limiting cases of a random model and the perfect 
forecaster. In the case of a random model where the rating system contains no 
discriminative power, the hit rate and the false alarm rate are equal regardless of the 
cut-off point. In the case of the perfect forecaster, the rating score distributions of
the defaulters and the non-defaulters of Fig. 13.3 are separated perfectly. Therefore, 
for every value of the hit rate less than one the false alarm rate is zero and for every 
value of the false alarm rate greater than zero, the hit rate is one. The corresponding 
ROC curve connects the three points (0, 0), (0, 1), and (1, 1) by straight lines. 

Similar to the CAP curve, where the information of the curve was summarized in 
the Accuracy Ratio, there is also a summary statistic for the ROC curve. It is the 
area below the ROC curve (AUROC). This statistic can take values between zero 
and one, where the AUROC of the random model is 0.5 and the AUROC of the 
perfect forecaster is 1.0. The closer the value of AUROC is to one, i.e., the more the 
ROC curve is to the upper left, the higher is the discriminative power of a rating 
system.³

We apply the concept of the ROC curve to the example in Tables 13.1 and 13.2. 
We proceed in the same way as in the previous subsection, when we computed the 
CAP curve. Since we have five rating categories, we can define four decision rules 
in total which gives us four points in addition to the points (0, 0) and (1, 1) on the 
ROC curve. To get a curve, the points have to be connected by straight lines. We 
compute the second point of the ROC curve for Rating 2 to illustrate the procedure. 
The remaining points are computed in an analogous way. Consider the decision rule 
that a debtor is classified as a defaulter if he has a rating of “2” or worse and is 
classified as a non-defaulter if he has a rating higher than “2”. The corresponding hit 
rate is computed as 


$$\hat{C}_D(2;2) = (28 + 11)/50 = 0.78,$$

while the corresponding false alarm rate is given by

$$\hat{C}_{ND}(2;2) = (180 + 200)/950 = 0.40.$$

The remaining points on the ROC curve of Rating 2 and Rating 1 are computed
in a similar fashion. The ROC curves of Rating 1 and Rating 2 are illustrated in
Fig. 13.5. Computing the area below the ROC curve, we get a value of 0.762 for
Rating 1 and 0.735 for Rating 2.
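
The same calculation can be sketched in code. The example assumes that each rating category defines one cut-off (a debtor is flagged as a defaulter if the rating is at or below the category) and that the ROC points are connected by straight lines; the function names are illustrative.

```python
# Per-category counts, worst category (1) first, read off Tables 13.1 and 13.2.
RATING_1_DEFAULTS = [27, 14, 2, 5, 2]
RATING_1_NON_DEFAULTS = [150, 200, 185, 215, 200]
RATING_2_DEFAULTS = [28, 11, 5, 4, 2]
RATING_2_NON_DEFAULTS = [180, 200, 210, 215, 145]


def roc_points(defaults, non_defaults):
    """Points (false alarm rate, hit rate), one per cut-off, plus (0, 0)."""
    total_d, total_nd = sum(defaults), sum(non_defaults)
    points, hits, false_alarms = [(0.0, 0.0)], 0, 0
    for d, nd in zip(defaults, non_defaults):
        hits += d              # defaulters rated at or below the cut-off
        false_alarms += nd     # survivors rated at or below the cut-off
        points.append((false_alarms / total_nd, hits / total_d))
    return points


def auroc(defaults, non_defaults):
    """Area below the piecewise linear ROC curve (trapezoidal rule)."""
    points = roc_points(defaults, non_defaults)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


second_point_rating_2 = roc_points(RATING_2_DEFAULTS, RATING_2_NON_DEFAULTS)[2]
auroc_1 = auroc(RATING_1_DEFAULTS, RATING_1_NON_DEFAULTS)
auroc_2 = auroc(RATING_2_DEFAULTS, RATING_2_NON_DEFAULTS)
print(second_point_rating_2, round(auroc_1, 3), round(auroc_2, 3))
```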


³ A rating system with an AUROC close to zero also has a high discriminative power. In this case,
the order of good and bad debtors is reversed. The good debtors have low rating scores while the
poor debtors have high ratings.

Fig. 13.5 ROC curves for Rating 1 and Rating 2


We finish this subsection by exploring the connection between AR and AUROC. 
We have seen that the CAP curve and the ROC curve are computed in a similar 
way. In fact, it can be shown that both concepts are just different ways to represent 
the same information. In Appendix A, we prove the simple relation between AR and
AUROC


$$AR = 2 \cdot AUROC - 1. \tag{13.2}$$


From a practical perspective, both concepts are equivalent and it is a question 
of preference as to which one is used to evaluate the discriminative power of a 
rating system. In Sect. 13.3, we will see that AUROC allows for an intuitive 
probabilistic interpretation which can be used to derive various statistical proper- 
ties of AUROC. By (13.2) this interpretation carries over to AR. However, it is 
less intuitive in this case. 
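
Relation (13.2) can be made plausible in a few lines. The following is only a compact sketch for piecewise linear CAP and ROC curves and does not replace the proof in Appendix A; the symbol $p = N_D / N_T$ for the empirical default rate is notation introduced here for illustration.

```latex
% Sketch, assuming piecewise linear CAP and ROC curves.
% p = N_D / N_T is the empirical default rate (notation introduced here).
\begin{align*}
\hat{C}_T &= p\,\hat{C}_D + (1-p)\,\hat{C}_{ND}, \\
\int_0^1 \hat{C}_D \, d\hat{C}_T
  &= p \int_0^1 \hat{C}_D \, d\hat{C}_D + (1-p) \int_0^1 \hat{C}_D \, d\hat{C}_{ND}
   = \frac{p}{2} + (1-p)\,\mathrm{AUROC}, \\
a_R &= \frac{p}{2} + (1-p)\,\mathrm{AUROC} - \frac{1}{2}
     = (1-p)\left(\mathrm{AUROC} - \frac{1}{2}\right), \\
a_P &= \left(1 - \frac{p}{2}\right) - \frac{1}{2} = \frac{1-p}{2}, \\
AR &= \frac{a_R}{a_P} = 2\,\mathrm{AUROC} - 1.
\end{align*}
```

For Rating 1, with $p = 0.05$ and AUROC = 0.7616, the area under the CAP curve is $0.025 + 0.95 \cdot 0.7616 = 0.7485$, which gives $a_R = 0.2485$, $a_P = 0.475$ and AR = 0.523, in line with the values reported above.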


13.2.3 Extensions 


CAP curves and ROC curves only allow a meaningful evaluation of some rating 
function’s ability to discriminate between “good” and “bad” if there is a linear 
relationship between the function’s value and the attributes “good” and “bad”. This 
is illustrated in Fig. 13.6. The figure shows a situation where the rating is able to 
discriminate perfectly between defaulters and survivors. However, the score distri- 
bution of the defaulters is bimodal. Defaulters have either very high or very low 
score values. In practice, when designing corporate ratings, some balance sheet 
variables like growth in sales have this feature.

Fig. 13.6 Score distribution of a non-linear rating function

Fig. 13.7 ROC curve corresponding to the score distribution of Fig. 13.6


A straightforward application of the ROC concept to this situation results in a
misleading value for AUROC. The ROC curve which corresponds to the rating 
distribution of Fig. 13.6 is shown in Fig. 13.7. It can be seen that the AUROC 
corresponding to the score distribution in Fig. 13.6 is equal to 0.5. In spite of the 
rating system’s ability to discriminate perfectly between defaulters and non-defaulters, 
its AUROC is the same as the AUROC of a rating system without any discrimina- 
tive power. This is due to the non-linearity in the relationship between the rating 
score and credit quality of the debtors. 
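
A small numerical illustration of this effect: the portfolio below is hypothetical (it is not the data behind Fig. 13.6) and places all defaulters in the two extreme categories and all survivors in the middle ones. The AUROC is computed from the pairwise comparison of defaulter and survivor scores, which gives the same number as the area below the ROC curve.

```python
def auroc_from_counts(defaults, non_defaults):
    """P(S_D < S_ND) + 0.5 * P(S_D = S_ND) from per-category counts (worst first)."""
    total_pairs = sum(defaults) * sum(non_defaults)
    better_survivors = sum(non_defaults)
    wins = ties = 0
    for d, nd in zip(defaults, non_defaults):
        better_survivors -= nd          # survivors with a strictly higher score
        wins += d * better_survivors
        ties += d * nd
    return (wins + 0.5 * ties) / total_pairs


# Hypothetical bimodal portfolio: defaulters only in categories 1 and 5,
# survivors only in categories 2 to 4.
bimodal_defaults = [25, 0, 0, 0, 25]
bimodal_survivors = [0, 300, 350, 300, 0]
auroc_raw = auroc_from_counts(bimodal_defaults, bimodal_survivors)

# Regrouping into "extreme" (1 or 5) and "middle" (2 to 4) scores, with
# "extreme" as the worse category, separates the two groups perfectly.
auroc_regrouped = auroc_from_counts([50, 0], [0, 950])

print(auroc_raw, auroc_regrouped)
```

The raw score yields an AUROC of 0.5 although the regrouped score separates defaulters and survivors perfectly.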

To obtain meaningful measures of discriminatory power also in this situation, 
Lee and Hsiao (1996) and Lee (1999) provide several extensions to the AUROC 
measure we have introduced in Sect. 13.2.2. We discuss only one of these extensions,
the one which could be most useful in a rating context.

Lee (1999) proposes a simple modification to the ROC concept which delivers
meaningful results for score distributions as illustrated in Fig. 13.6. For each rating
category the likelihood ratio $L$ is computed as


$$L(R_i) = \frac{S_D(R_i)}{S_{ND}(R_i)}. \tag{13.3}$$


The likelihood ratio is the ratio of the probability that a defaulter is assigned
to rating category $R_i$ to the corresponding probability for a non-defaulter. To
illustrate this concept, we compute the empirical likelihood ratio $\hat{L}$ which is
defined as


$$\hat{L}(R_i) = \frac{\hat{S}_D(R_i)}{\hat{S}_{ND}(R_i)}, \tag{13.4}$$


for the rating systems Rating 1 and Rating 2. The results are given in Table 13.4.

Table 13.4 Empirical likelihood ratios for Rating 1 and Rating 2

                                        Rating category
                                     1      2      3      4      5
Rating 1   $\hat{S}_D(R_i;1)$      0.54   0.28   0.04   0.10   0.04
           $\hat{S}_{ND}(R_i;1)$   0.16   0.21   0.19   0.23   0.21
           $\hat{L}(R_i;1)$        3.42   1.33   0.21   0.44   0.19
Rating 2   $\hat{S}_D(R_i;2)$      0.56   0.22   0.10   0.08   0.04
           $\hat{S}_{ND}(R_i;2)$   0.19   0.21   0.22   0.23   0.15
           $\hat{L}(R_i;2)$        2.96   1.05   0.45   0.35   0.26


In the next step, the likelihood ratios are sorted from the highest to the lowest.
Finally, the likelihood ratios are inverted to define a new rating score.⁴ In doing so,
we have defined a new rating score that assigns low score values to low credit
quality. The crucial point in this transformation is that we can be sure that after the
transformation, low credit quality corresponds to low score values even if the
original data looks like the data in Fig. 13.6.

We compute the ROC curves for the new rating score. They are given in
Fig. 13.8. Note that there is no difference to the previous ROC curve for Rating
2 because the sorting of the likelihood ratios did not change the order of the rating
scores. However, there is a difference for Rating 1. The AUROC of Rating 1 has
increased slightly from 0.7616 to 0.7721. Furthermore, the ROC curve of Rating 1
is concave everywhere after the transformation. As pointed out by Tasche (2002),
the non-concavity of a ROC curve is a clear sign that the rating model does not
reflect the information contained in the data in an optimal way. With this simple
transformation, the quality of the rating model can be improved.


⁴ The inversion of the likelihood ratios is not necessary. We are doing this here just for didactical
reasons to ensure that low credit quality corresponds to low rating scores throughout this chapter.

Fig. 13.8 ROC curves for the transformed rating scores of Rating 1 and Rating 2 (axes: False Alarm Rate, Hit Rate; curves: Rating 1 (LR), Rating 2 (LR), Random Rating, Perfect Forecaster)

A practical problem in the construction of rating models is the inclusion of 
variables that are non-linear in the credit quality of debtors (e.g., Fig. 13.6). As 
pointed out in Chap. 2, these variables can offer a valuable contribution to a rating 
model provided that they are transformed prior to the estimation of the rating 
model. There are several ways to conduct this transformation. Computing likeli- 
hood ratios and sorting them as was done here is a feasible way of producing linear 
variables from non-linear ones. For further details and an example with real data, 
refer to Engelmann et al. (2003b). 
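
The likelihood ratio transformation of the example ratings can be sketched as follows. The code assumes that every category contains at least one non-defaulter (otherwise the ratio is undefined) and uses illustrative function names; it reorders the categories by decreasing likelihood ratio and recomputes the AUROC.

```python
RATING_1_DEFAULTS = [27, 14, 2, 5, 2]
RATING_1_NON_DEFAULTS = [150, 200, 185, 215, 200]
RATING_2_DEFAULTS = [28, 11, 5, 4, 2]
RATING_2_NON_DEFAULTS = [180, 200, 210, 215, 145]


def auroc_from_counts(defaults, non_defaults):
    """P(S_D < S_ND) + 0.5 * P(S_D = S_ND) from per-category counts (worst first)."""
    total_pairs = sum(defaults) * sum(non_defaults)
    better_survivors = sum(non_defaults)
    wins = ties = 0
    for d, nd in zip(defaults, non_defaults):
        better_survivors -= nd
        wins += d * better_survivors
        ties += d * nd
    return (wins + 0.5 * ties) / total_pairs


def reorder_by_likelihood_ratio(defaults, non_defaults):
    """Return the empirical likelihood ratios and the counts sorted by them."""
    n_d, n_nd = sum(defaults), sum(non_defaults)
    ratios = [(d / n_d) / (nd / n_nd) for d, nd in zip(defaults, non_defaults)]
    # Highest likelihood ratio first, i.e. the new worst category comes first.
    order = sorted(range(len(ratios)), key=lambda i: ratios[i], reverse=True)
    return ratios, [defaults[i] for i in order], [non_defaults[i] for i in order]


ratios_1, d_1, nd_1 = reorder_by_likelihood_ratio(RATING_1_DEFAULTS, RATING_1_NON_DEFAULTS)
ratios_2, d_2, nd_2 = reorder_by_likelihood_ratio(RATING_2_DEFAULTS, RATING_2_NON_DEFAULTS)

auroc_1_before = auroc_from_counts(RATING_1_DEFAULTS, RATING_1_NON_DEFAULTS)
auroc_1_after = auroc_from_counts(d_1, nd_1)
auroc_2_before = auroc_from_counts(RATING_2_DEFAULTS, RATING_2_NON_DEFAULTS)
auroc_2_after = auroc_from_counts(d_2, nd_2)

print([round(r, 2) for r in ratios_1], round(auroc_1_before, 4), round(auroc_1_after, 4))
```

For Rating 1 the sort swaps categories 3 and 4, while the order of Rating 2 is left unchanged.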


13.3 Statistical Properties of AUROC 


In this section we will discuss the statistical properties of AUROC. We focus on 
AUROC because it can be interpreted intuitively in terms of a probability. Starting 
from this interpretation we can derive several useful expressions which allow the
computation of confidence intervals for AUROC, a rigorous test of whether a rating model has
any discriminative power at all, and a test for the difference of two rating systems’
AUROC. All results that are derived in this section carry over to AR by applying the
simple relation (13.2) between AR and AUROC.


13.3.1 Probabilistic Interpretation of AUROC


The cumulative distribution function of a random variable evaluated at some value
x gives the probability that this random variable takes a value which is less than or
equal to x. In our notation, this reads as


$$\begin{aligned}
C_D(R_i) &= P(S_D \le R_i), \\
C_{ND}(R_i) &= P(S_{ND} \le R_i),
\end{aligned} \tag{13.5}$$

or in terms of the empirical distribution function

$$\begin{aligned}
\hat{C}_D(R_i) &= P(\hat{S}_D \le R_i), \\
\hat{C}_{ND}(R_i) &= P(\hat{S}_{ND} \le R_i),
\end{aligned} \tag{13.6}$$


where P(.) denotes the probability of the event in brackets (.).

In Appendix B, we show that AUROC can be expressed in terms of empirical
probabilities as


$$AUROC = P(\hat{S}_D < \hat{S}_{ND}) + \frac{1}{2} P(\hat{S}_D = \hat{S}_{ND}). \tag{13.7}$$


To get further insight, we introduce the Mann-Whitney statistic U as


$$U = \frac{1}{N_D \cdot N_{ND}} \sum_{(D,ND)} u_{D,ND}, \qquad
u_{D,ND} = \begin{cases}
1, & \text{if } \hat{s}_D < \hat{s}_{ND} \\
\frac{1}{2}, & \text{if } \hat{s}_D = \hat{s}_{ND} \\
0, & \text{if } \hat{s}_D > \hat{s}_{ND}
\end{cases} \tag{13.8}$$


where $\hat{s}_D$ is a realisation of the empirical score distribution $\hat{S}_D$ and $\hat{s}_{ND}$ is
a realisation of $\hat{S}_{ND}$. The sum in (13.8) is over all possible pairs of a defaulter and
a non-defaulter. It is easy to see that


$$U = P(\hat{S}_D < \hat{S}_{ND}) + \frac{1}{2} P(\hat{S}_D = \hat{S}_{ND}), \tag{13.9}$$


which means the area below the ROC curve and the Mann-Whitney statistic are
measuring the same quantity.

This gives us a very intuitive interpretation of AUROC. Suppose we draw 
randomly one defaulter out of the sample of defaulters and one survivor out of 
the sample of survivors. Suppose further we should decide from the rating scores
of both debtors which one is the defaulter. If the rating scores are different, we
would guess that the debtor with the lower rating score is the defaulter. If both 
scores are equal we would toss a coin. The probability that we make a correct 
decision is given by the right-hand-side of (13.9), i.e., by the area below the 
ROC curve. 
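
The pairwise interpretation translates directly into code. The sketch below expands the counts of Rating 1 into individual scores and evaluates the Mann-Whitney statistic by looping over all 50 x 950 pairs of a defaulter and a survivor; the helper names are illustrative.

```python
from itertools import product

RATING_1_DEFAULTS = [27, 14, 2, 5, 2]
RATING_1_NON_DEFAULTS = [150, 200, 185, 215, 200]


def expand(counts):
    """Turn per-category counts into a list of individual rating scores (1 = worst)."""
    return [score for score, n in enumerate(counts, start=1) for _ in range(n)]


def mann_whitney_u(default_scores, survivor_scores):
    """Average of u = 1 (defaulter lower), 0.5 (tie), 0 (defaulter higher) over all pairs."""
    total = 0.0
    for s_d, s_nd in product(default_scores, survivor_scores):
        if s_d < s_nd:
            total += 1.0
        elif s_d == s_nd:
            total += 0.5
    return total / (len(default_scores) * len(survivor_scores))


u_rating_1 = mann_whitney_u(expand(RATING_1_DEFAULTS), expand(RATING_1_NON_DEFAULTS))
print(round(u_rating_1, 4))
```

The result agrees with the area below the ROC curve of Rating 1 computed in Sect. 13.2.2.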

Throughout this article, we have introduced all concepts and quantities with the 
data set given in Tables 13.1 and 13.2. However, the data set of Tables 13.1 and 
13.2 is only one particular realisation of defaults and survivals from the underlying 
score distributions which are unknown. It is not the only possible realisation. In 
principle, other realisations of defaults could occur which lead to different values 
for AUROC and U. These different possible values are dispersed about the expected 
values of AUROC and U that are given by 


$$E[AUROC] = E[U] = P(S_D < S_{ND}) + \frac{1}{2} P(S_D = S_{ND}). \tag{13.10}$$


To get a feeling of how far the realised value of AUROC deviates from its 
expected value, confidence intervals have to be computed. This is done in the next 
subsection. 

Finally, we remark that the AR measure can also be expressed in terms of 
probabilities. Applying (13.2) we find 


$$E[AR] = P(S_D < S_{ND}) - P(S_D > S_{ND}). \tag{13.11}$$


The expected value of AR is the difference between the probability that a 
defaulter has a lower rating score than a survivor and the probability that a defaulter 
has a higher rating score than a survivor. It is not so clear how to give an intuitive 
interpretation of this expression. 


13.3.2 Computing Confidence Intervals for AUROC 


To get a feeling for the accuracy of a measure obtained from a data sample, it is 
customary to state confidence intervals to a confidence level α, e.g., α = 95%. In
the first papers on the measurement of the discriminative power of rating systems,
confidence intervals were always computed by bootstrapping.⁵ These papers mainly
used the measure AR. Bootstrapping requires the drawing of lots of portfolios 
with replacement from the original portfolio. For each portfolio, the AR has to 
be computed. From the resulting distribution of the AR values, confidence inter- 
vals can be computed. The main drawback of this method is its computational 
inefficiency. 
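
A percentile bootstrap along these lines might look as follows. This is a sketch under several assumptions that are not taken from the chapter: it resamples whole debtors from the Rating 1 portfolio, works with AUROC instead of AR, uses 500 resamples and a fixed seed, and reads the interval off the sorted estimates.

```python
import random

RATING_1_DEFAULTS = [27, 14, 2, 5, 2]
RATING_1_NON_DEFAULTS = [150, 200, 185, 215, 200]


def auroc_from_counts(defaults, non_defaults):
    """P(S_D < S_ND) + 0.5 * P(S_D = S_ND) from per-category counts (worst first)."""
    total_pairs = sum(defaults) * sum(non_defaults)
    better_survivors = sum(non_defaults)
    wins = ties = 0
    for d, nd in zip(defaults, non_defaults):
        better_survivors -= nd
        wins += d * better_survivors
        ties += d * nd
    return (wins + 0.5 * ties) / total_pairs


def bootstrap_auroc_interval(defaults, non_defaults, n_resamples=500, level=0.95, seed=0):
    """Percentile bootstrap interval for AUROC, resampling debtors with replacement."""
    rng = random.Random(seed)
    k = len(defaults)
    # One (category index, defaulted?) record per debtor of the original portfolio.
    portfolio = [(i, True) for i, n in enumerate(defaults) for _ in range(n)]
    portfolio += [(i, False) for i, n in enumerate(non_defaults) for _ in range(n)]
    estimates = []
    for _ in range(n_resamples):
        d, nd = [0] * k, [0] * k
        for i, defaulted in rng.choices(portfolio, k=len(portfolio)):
            (d if defaulted else nd)[i] += 1
        if sum(d) and sum(nd):      # skip degenerate resamples lacking one group
            estimates.append(auroc_from_counts(d, nd))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper


lower, upper = bootstrap_auroc_interval(RATING_1_DEFAULTS, RATING_1_NON_DEFAULTS)
print(round(lower, 3), round(upper, 3))
```

Each resample requires a pass over the whole portfolio, which is the source of the computational cost mentioned above.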


⁵ Efron and Tibshirani (1998) is a standard reference for this technique.

A more efficient method is based on the application of well-known properties of
the Mann-Whitney statistic introduced in (13.8). The connection between AR and a
slightly modified Mann-Whitney statistic is less obvious⁶ than for AUROC which
might be the reason for the inefficient techniques that were used in those early
papers.

From mathematical statistics it is known that an unbiased estimator of the
variance $\sigma_U^2$ of the Mann-Whitney statistic U in (13.8) is given by


$$\hat{\sigma}_U^2 = \frac{1}{4 \cdot (N_D - 1) \cdot (N_{ND} - 1)} \Big( \hat{P}_{D \ne ND} + (N_D - 1) \cdot \hat{P}_{D,D,ND}
+ (N_{ND} - 1) \cdot \hat{P}_{ND,ND,D} - 4 \cdot (N_D + N_{ND} - 1) \cdot (U - 0.5)^2 \Big), \tag{13.12}$$


where $\hat{P}_{D \ne ND}$, $\hat{P}_{D,D,ND}$, and $\hat{P}_{ND,ND,D}$ are estimators for the probabilities $P(S_D \ne S_{ND})$,
$P_{D,D,ND}$, and $P_{ND,ND,D}$ where the latter two are defined as


$$\begin{aligned}
P_{D,D,ND} &= P(S_{D,1}, S_{D,2} < S_{ND}) - P(S_{D,1} < S_{ND} < S_{D,2}) \\
&\quad - P(S_{D,2} < S_{ND} < S_{D,1}) + P(S_{ND} < S_{D,1}, S_{D,2}), \\
P_{ND,ND,D} &= P(S_{ND,1}, S_{ND,2} < S_D) - P(S_{ND,1} < S_D < S_{ND,2}) \\
&\quad - P(S_{ND,2} < S_D < S_{ND,1}) + P(S_D < S_{ND,1}, S_{ND,2}),
\end{aligned} \tag{13.13}$$


where $S_{D,1}$ and $S_{D,2}$ are two independent draws of the defaulter’s score distribution
and $S_{ND,1}$ and $S_{ND,2}$ are two independent draws of the non-defaulter’s score
distribution.

Using (13.12) confidence intervals can be easily computed using the asymptotic
relationship


$$\frac{AUROC - E[AUROC]}{\hat{\sigma}_U} \xrightarrow{N_D, N_{ND} \to \infty} N(0, 1). \tag{13.14}$$


The corresponding confidence intervals to the level α are given by


$$\left[ AUROC - \hat{\sigma}_U \Phi^{-1}\left(\frac{1+\alpha}{2}\right),\; AUROC + \hat{\sigma}_U \Phi^{-1}\left(\frac{1+\alpha}{2}\right) \right], \tag{13.15}$$


where Φ denotes the cumulative distribution function of the standard normal
distribution.

The asymptotic relation (13.14) is valid for large numbers $N_D$ and $N_{ND}$. The
question arises how many defaults a portfolio must contain to make the asymptotic
relation valid. In Engelmann et al. (2003a, b) a comparison between (13.14) and
bootstrapping is carried out. It is shown that for 50 defaults a very good agreement
between (13.14) and bootstrapping is achieved. But even for small numbers like
10 or 20 reasonable approximations for the bootstrapping results are obtained.


⁶ In (13.8) the ½ has to be replaced by 0, and the 0 has to be replaced by −1 to get the corresponding
Mann-Whitney statistic for the AR.

We finish this subsection by computing the 95% confidence interval for the
AUROC of our examples Rating 1 and Rating 2. We start with Rating 1. First we
compute $\hat{P}_{D \ne ND}$. It is given by the fraction of all pairs of a defaulter and a survivor
with different rating scores. It is computed explicitly as


$$\begin{aligned}
\hat{P}_{D \ne ND} &= 1 - \hat{P}_{D = ND} = 1 - \sum_{i=1}^{5} \hat{S}_D(i;1) \cdot \hat{S}_{ND}(i;1) \\
&= 1 - (2 \cdot 200 + 5 \cdot 215 + 2 \cdot 185 + 14 \cdot 200 + 27 \cdot 150)/(50 \cdot 950) = 0.817.
\end{aligned}$$


The estimators for $P_{D,D,ND}$ and $P_{ND,ND,D}$ are more difficult to compute than for
$P_{D \ne ND}$. To estimate $P_{D,D,ND}$ it is necessary to estimate three probabilities,
$P(S_{D,1}, S_{D,2} < S_{ND})$, $P(S_{D,1} < S_{ND} < S_{D,2})$ (which is equal to $P(S_{D,2} < S_{ND} < S_{D,1})$), and
$P(S_{ND} < S_{D,1}, S_{D,2})$. We illustrate the procedure for $P(S_{D,1}, S_{D,2} < S_{ND})$. The other
probabilities are computed analogously.

A naive way to compute $P(S_{D,1}, S_{D,2} < S_{ND})$ is to implement a triple loop, two
loops over all defaulters and one loop over all survivors. For each triple, one has to
check if the scores of both defaulters are less than the score of the survivor. The
probability $P(S_{D,1}, S_{D,2} < S_{ND})$ is then estimated as the number of triples where this
condition is fulfilled divided by the total number of all triples. However, this procedure is
very time consuming when the number of survivors is large. It is much more
efficient to exploit the sorting of the debtors in their score values. With the
convention $\hat{C}_D(0) = 0$, we get the results


$$\begin{aligned}
\hat{P}(S_{D,1}, S_{D,2} < S_{ND}) &= \sum_{i=1}^{k} \hat{C}_D(i-1)^2 \cdot \hat{S}_{ND}(i), \\
\hat{P}(S_{D,1} < S_{ND} < S_{D,2}) &= \sum_{i=1}^{k} \hat{C}_D(i-1) \cdot \hat{S}_{ND}(i) \cdot (1 - \hat{C}_D(i)), \\
\hat{P}(S_{ND} < S_{D,1}, S_{D,2}) &= \sum_{i=1}^{k} \hat{S}_{ND}(i) \cdot (1 - \hat{C}_D(i))^2.
\end{aligned}$$
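
The equivalence of the naive triple loop and the sorted formula can be checked on a small sample. The scores below are hypothetical and chosen only to keep the loop short; both defaulter draws are taken with replacement, which is what two independent draws from the empirical distribution amount to.

```python
from itertools import product

# Hypothetical small sample of rating scores (1 = worst, 5 = best).
default_scores = [1, 1, 2, 3]
survivor_scores = [2, 3, 3, 4, 5, 5]


def naive_both_below(defaults, survivors):
    """Triple loop: share of (defaulter, defaulter, survivor) triples with both defaulters lower."""
    hits = sum(1 for d1, d2, nd in product(defaults, defaults, survivors)
               if d1 < nd and d2 < nd)
    return hits / (len(defaults) ** 2 * len(survivors))


def sorted_both_below(defaults, survivors, k=5):
    """Sum over categories of C_D(i - 1)^2 * S_ND(i), with C_D(0) = 0."""
    estimate, cum_d = 0.0, 0.0
    for i in range(1, k + 1):
        s_nd = survivors.count(i) / len(survivors)
        estimate += cum_d ** 2 * s_nd     # cum_d equals C_D(i - 1) at this point
        cum_d += defaults.count(i) / len(defaults)
    return estimate


p_naive = naive_both_below(default_scores, survivor_scores)
p_sorted = sorted_both_below(default_scores, survivor_scores)
print(p_naive, p_sorted)
```

The triple loop visits every combination of two defaulters and one survivor, whereas the sorted version needs only one pass over the k rating categories.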


Similar estimation formulas can be derived for $P(S_{ND,1}, S_{ND,2} < S_D)$,
$P(S_{ND,1} < S_D < S_{ND,2})$, and $P(S_D < S_{ND,1}, S_{ND,2})$. Applying these formulas to the rating system
Rating 1 we get


$$\begin{aligned}
\hat{P}(S_{D,1}, S_{D,2} < S_{ND}; 1) &= 0.554, & \hat{P}(S_{D,1} < S_{ND} < S_{D,2}; 1) &= 0.051, \\
\hat{P}(S_{ND} < S_{D,1}, S_{D,2}; 1) &= 0.044, & \hat{P}_{D,D,ND} &= 0.497, \\
\hat{P}(S_{ND,1}, S_{ND,2} < S_D; 1) &= 0.069, & \hat{P}(S_{ND,1} < S_D < S_{ND,2}; 1) &= 0.046, \\
\hat{P}(S_D < S_{ND,1}, S_{ND,2}; 1) &= 0.507, & \hat{P}_{ND,ND,D} &= 0.483.
\end{aligned}$$


Now we have all ingredients for (13.12) and compute the variance of U as
$\hat{\sigma}_U^2 = 0.001131$. Finally we compute the confidence interval to the level 95% which
results in [0.69573, 0.82754]. A similar calculation for Rating 2 leads to a 95%
confidence interval of [0.66643, 0.80431]. We see that both confidence intervals are
rather broad. This is due to the relatively low number of debtors in our example
rating systems.
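
The whole calculation for Rating 1 can be sketched in a few functions. The code assumes the category-wise formulas given above with the convention that the cumulative distribution is zero below the first category; `triple_term` and `auroc_confidence_interval` are illustrative names, and the normal quantile is taken from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

RATING_1_DEFAULTS = [27, 14, 2, 5, 2]
RATING_1_NON_DEFAULTS = [150, 200, 185, 215, 200]


def cumulative(shares):
    """Running sums of a list of category shares."""
    out, running = [], 0.0
    for s in shares:
        running += s
        out.append(running)
    return out


def triple_term(s_a, s_b):
    """P(A1, A2 < B) - 2 * P(A1 < B < A2) + P(B < A1, A2), two draws from A and one from B."""
    c_a = cumulative(s_a)
    c_prev = [0.0] + c_a[:-1]             # C_A(i - 1), with C_A(0) = 0
    both_below = sum(cp ** 2 * sb for cp, sb in zip(c_prev, s_b))
    between = sum(cp * sb * (1 - c) for cp, sb, c in zip(c_prev, s_b, c_a))
    both_above = sum(sb * (1 - c) ** 2 for sb, c in zip(s_b, c_a))
    return both_below - 2 * between + both_above


def auroc_confidence_interval(defaults, non_defaults, level=0.95):
    """AUROC, variance estimate (13.12) and confidence interval (13.15)."""
    n_d, n_nd = sum(defaults), sum(non_defaults)
    s_d = [d / n_d for d in defaults]
    s_nd = [nd / n_nd for nd in non_defaults]
    p_equal = sum(a * b for a, b in zip(s_d, s_nd))
    p_less = sum(a * (1 - c) for a, c in zip(s_d, cumulative(s_nd)))  # P(S_D < S_ND)
    u = p_less + 0.5 * p_equal
    variance = ((1 - p_equal)
                + (n_d - 1) * triple_term(s_d, s_nd)      # P_{D,D,ND}
                + (n_nd - 1) * triple_term(s_nd, s_d)     # P_{ND,ND,D}
                - 4 * (n_d + n_nd - 1) * (u - 0.5) ** 2) / (4 * (n_d - 1) * (n_nd - 1))
    half_width = variance ** 0.5 * NormalDist().inv_cdf((1 + level) / 2)
    return u, variance, (u - half_width, u + half_width)


u_1, variance_1, interval_1 = auroc_confidence_interval(RATING_1_DEFAULTS, RATING_1_NON_DEFAULTS)
print(round(u_1, 4), round(variance_1, 6), [round(x, 5) for x in interval_1])
```

In contrast to bootstrapping, the cost of this calculation depends only on the number of rating categories and not on the number of debtors.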


13.3.3 Testing for Discriminative Power 


The 95% confidence intervals of the AUROC of Rating 1 and Rating 2 are far away 
from the value 0.5. This suggests that the discriminative power of both rating systems 
is statistically significant. To confirm this we apply a rigorous statistical test. 

The null hypothesis of our test is that a rating system does not contain any 
discriminative power. Under this null hypothesis, (13.12) can be simplified consid- 
erably. If a rating system has no discriminative power, the score distributions of the 
defaulters and the survivors are identical. We get the identity 


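$$
\begin{aligned}
P(S_D \neq S_{ND})/3 &= P(S_{D,1}, S_{D,2} < S_{ND}) = P(S_{D,1} < S_{ND} < S_{D,2}) \\
&= P(S_{ND} < S_{D,1}, S_{D,2}) = P(S_{ND,1}, S_{ND,2} < S_D) \\
&= P(S_{ND,1} < S_D < S_{ND,2}) = P(S_D < S_{ND,1}, S_{ND,2}).
\end{aligned}
\tag{13.16}
$$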


This leads to the simplified formula for the variance of the Mann-Whitney 
statistic 


$$
\hat{\sigma}_U^2 = \frac{P(S_D \neq S_{ND}) \cdot (1 + N_D + N_{ND})}{12 \cdot (N_D - 1) \cdot (N_{ND} - 1)}. \tag{13.17}
$$


If we make a two-sided test, the p-value of this test is given by solving (13.15) for
one minus the confidence level $\alpha$. This calculation results in

$$
p\text{-value} = 2 - 2 \cdot \Phi\left(\frac{U - 0.5}{\hat{\sigma}_U}\right). \tag{13.18}
$$
The application of (13.18) with the variance of (13.17) leads to a p-value of 
8.23 x 107" for Rating 1. The corresponding value for Rating 2 is 5.36 x 107". 
This means both rating systems have a highly significant discriminatory power. 
This confirms our conjecture at the beginning of this subsection. 
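As a minimal sketch, the test can be coded in a few lines of Python. It assumes that (13.17) reads as the probability of unequal scores times $(1 + N_D + N_{ND})$, divided by $12 (N_D - 1)(N_{ND} - 1)$, and that (13.18) is applied with $U \geq 0.5$. The function names and the inputs in the usage lines are hypothetical and are not the data of Table 13.1.

```python
from math import sqrt
from statistics import NormalDist


def null_variance(p_unequal, n_d, n_nd):
    """Variance of U under the null hypothesis, following the reading of (13.17).

    p_unequal is the estimated probability that a defaulter and a
    survivor have different scores.
    """
    return p_unequal * (1 + n_d + n_nd) / (12 * (n_d - 1) * (n_nd - 1))


def auroc_p_value(u, variance):
    """Two-sided p-value following (13.18); assumes u >= 0.5."""
    z = (u - 0.5) / sqrt(variance)
    return 2.0 - 2.0 * NormalDist().cdf(z)


# Hypothetical inputs for illustration only.
variance = null_variance(p_unequal=0.9, n_d=20, n_nd=180)
p_value = auroc_p_value(u=0.75, variance=variance)
```

An AUROC of exactly 0.5 yields a p-value of 1, and the p-value shrinks as U moves away from 0.5 or as the sample grows.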


13.3.4 Testing for the Difference of Two AUROCs 


Throughout this article we always considered two rating models, Rating 1 and 
Rating 2. We have seen so far that Rating 1 has a slightly higher AUROC than
Rating 2. The question arises whether this difference is significant from a statistical
point of view. To answer this question, we discuss a test on the difference of two
AUROCs that was developed by DeLong et al. (1988).

Comparing the confidence intervals of the AUROC of Rating 1 and Rating 2, we
find that they overlap widely. Therefore, we would suppose that there is no significant
difference between both AUROCs. However, when comparing confidence
intervals only, we are neglecting correlations between both AUROCs. To carry out a
rigorous statistical test, we need, in addition to the variances of both AUROCs, the
covariance between them.

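The estimator for the covariance is more complex than the estimator for the
variance. It is given by

$$
\begin{aligned}
\hat{\sigma}_{U_1,U_2} = \frac{1}{4 \cdot (N_D - 1) \cdot (N_{ND} - 1)} \Big[ & \hat{P}^{12}_{D,D,ND,ND} + (N_D - 1) \cdot \hat{P}^{12}_{D,D,ND} + (N_{ND} - 1) \cdot \hat{P}^{12}_{ND,ND,D} \\
& - 4 \cdot (N_D + N_{ND} - 1) \cdot (U_1 - 0.5) \cdot (U_2 - 0.5) \Big],
\end{aligned}
\tag{13.19}
$$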


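where $\hat{P}^{12}_{D,D,ND,ND}$, $\hat{P}^{12}_{D,D,ND}$, and $\hat{P}^{12}_{ND,ND,D}$ are estimators
for the probabilities $P^{12}_{D,D,ND,ND}$, $P^{12}_{D,D,ND}$, and $P^{12}_{ND,ND,D}$, which are
defined as

$$
\begin{aligned}
P^{12}_{D,D,ND,ND} ={}& P(S^1_D > S^1_{ND}, S^2_D > S^2_{ND}) + P(S^1_D < S^1_{ND}, S^2_D < S^2_{ND}) \\
& - P(S^1_D > S^1_{ND}, S^2_D < S^2_{ND}) - P(S^1_D < S^1_{ND}, S^2_D > S^2_{ND}), \\
P^{12}_{D,D,ND} ={}& P(S^1_{D,1} > S^1_{ND}, S^2_{D,2} > S^2_{ND}) + P(S^1_{D,1} < S^1_{ND}, S^2_{D,2} < S^2_{ND}) \\
& - P(S^1_{D,1} > S^1_{ND}, S^2_{D,2} < S^2_{ND}) - P(S^1_{D,1} < S^1_{ND}, S^2_{D,2} > S^2_{ND}), \\
P^{12}_{ND,ND,D} ={}& P(S^1_D > S^1_{ND,1}, S^2_D > S^2_{ND,2}) + P(S^1_D < S^1_{ND,1}, S^2_D < S^2_{ND,2}) \\
& - P(S^1_D > S^1_{ND,1}, S^2_D < S^2_{ND,2}) - P(S^1_D < S^1_{ND,1}, S^2_D > S^2_{ND,2}),
\end{aligned}
\tag{13.20}
$$

where the quantities $S^i_D$, $S^i_{D,1}$, $S^i_{D,2}$ are independent draws from the score
distribution of the defaulters. The index i indicates whether the score of Rating 1 or of
Rating 2 has to be taken for this defaulter. The meaning of $S^i_{ND}$, $S^i_{ND,1}$,
$S^i_{ND,2}$ is analogous for the score distributions of the non-defaulters under Rating 1
and Rating 2.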

Under the null hypothesis that both AUROCs are equal, it is shown in DeLong
et al. (1988) that the test statistic T, which is defined as

$$
T = \frac{(U_1 - U_2)^2}{\hat{\sigma}^2_{U_1} + \hat{\sigma}^2_{U_2} - 2 \hat{\sigma}_{U_1,U_2}}, \tag{13.21}
$$

is asymptotically $\chi^2$ distributed with one degree of freedom. This asymptotic
relationship allows us to compute critical values given a confidence level $\alpha$.

We finish this section by computing the p-value of the test for the difference of the
two AUROCs of Rating 1 and Rating 2. The variances for both AUROC values
have already been computed in Sect. 13.3.2. It remains to compute the covariance
between both AUROCs. We show explicitly how to compute estimators for
$P^{12}_{D,D,ND,ND}$ and $P^{12}_{D,D,ND}$. The estimator for $P^{12}_{ND,ND,D}$ is computed in a
similar way as for $P^{12}_{D,D,ND}$.

We start with the computation of $\hat{P}^{12}_{D,D,ND,ND}$. To compute this estimator, four
probabilities have to be calculated from the sample. Consider the probability
$P(S^1_D > S^1_{ND}, S^2_D > S^2_{ND})$. A naive way to calculate this probability would be to
implement a loop over all defaulters and a loop over all survivors. This probability
is then given by the fraction of all pairs where both Rating 1 and Rating 2 assign
a higher rating score to the defaulter. This can be done in a more efficient way by
using the sorting of debtors in score values. The four probabilities needed for the
computation of $\hat{P}^{12}_{D,D,ND,ND}$ can be calculated by


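$$
\begin{aligned}
\hat{P}(S^1_D > S^1_{ND}, S^2_D > S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \sum_{k=i+1}^{5} \sum_{l=j+1}^{5} \hat{S}^{12}_D(k,l), \\
\hat{P}(S^1_D < S^1_{ND}, S^2_D < S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \sum_{k=1}^{i-1} \sum_{l=1}^{j-1} \hat{S}^{12}_D(k,l), \\
\hat{P}(S^1_D > S^1_{ND}, S^2_D < S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \sum_{k=i+1}^{5} \sum_{l=1}^{j-1} \hat{S}^{12}_D(k,l), \\
\hat{P}(S^1_D < S^1_{ND}, S^2_D > S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \sum_{k=1}^{i-1} \sum_{l=j+1}^{5} \hat{S}^{12}_D(k,l).
\end{aligned}
$$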


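Evaluating these formulas with the data of Table 13.1 leads to

$$
\begin{aligned}
\hat{P}(S^1_D > S^1_{ND}, S^2_D > S^2_{ND}) &= 0.0747, & \hat{P}(S^1_D < S^1_{ND}, S^2_D < S^2_{ND}) &= 0.5314, \\
\hat{P}(S^1_D > S^1_{ND}, S^2_D < S^2_{ND}) &= 0.0357, & \hat{P}(S^1_D < S^1_{ND}, S^2_D > S^2_{ND}) &= 0.0506, \\
\hat{P}^{12}_{D,D,ND,ND} &= 0.5197.
\end{aligned}
$$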
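The four probabilities and their combination into $\hat{P}^{12}_{D,D,ND,ND}$ can be sketched in Python as follows. The sketch assumes that each debtor is described by a pair of category indices (Rating 1, Rating 2) counted from 0, with a higher index meaning a higher score. The function names and the sample data are hypothetical and are not the data of Table 13.1.

```python
def joint_frequencies(pairs, k):
    """Relative frequency of each (Rating 1, Rating 2) category pair."""
    freq = [[0.0] * k for _ in range(k)]
    for r1, r2 in pairs:
        freq[r1][r2] += 1.0 / len(pairs)
    return freq


def p12_ddndnd(defaulters, survivors, k):
    """Return the four joint probabilities and their signed sum."""
    s_d = joint_frequencies(defaulters, k)
    s_nd = joint_frequencies(survivors, k)
    gg = ll = gl = lg = 0.0  # g: defaulter score greater, l: lower
    for i in range(k):
        for j in range(k):
            w = s_nd[i][j]
            if w == 0.0:
                continue
            for m in range(k):
                for n in range(k):
                    mass = w * s_d[m][n]
                    if m > i and n > j:
                        gg += mass
                    elif m < i and n < j:
                        ll += mass
                    elif m > i and n < j:
                        gl += mass
                    elif m < i and n > j:
                        lg += mass
    return gg, ll, gl, lg, gg + ll - gl - lg


def p12_ddndnd_naive(defaulters, survivors):
    """Double loop over all defaulter/survivor pairs, as in the naive method."""
    gg = ll = gl = lg = 0
    for d1, d2 in defaulters:
        for s1, s2 in survivors:
            gg += d1 > s1 and d2 > s2
            ll += d1 < s1 and d2 < s2
            gl += d1 > s1 and d2 < s2
            lg += d1 < s1 and d2 > s2
    total = len(defaulters) * len(survivors)
    return gg / total, ll / total, gl / total, lg / total, (gg + ll - gl - lg) / total


# Hypothetical (Rating 1, Rating 2) categories for illustration only.
DEFAULTERS = [(0, 0), (0, 1), (1, 0), (2, 1), (1, 3)]
SURVIVORS = [(1, 1), (2, 2), (3, 2), (4, 4), (3, 4), (2, 3), (4, 3)]
result = p12_ddndnd(DEFAULTERS, SURVIVORS, k=5)
```

Working on the joint frequency tables makes the cost depend on the number of rating categories instead of the number of debtor pairs.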


In the next step we consider the estimation of $P^{12}_{D,D,ND}$. Again, four probabilities
have to be estimated. A naive way to estimate, for instance, the probability
$P(S^1_{D,1} > S^1_{ND}, S^2_{D,2} > S^2_{ND})$ is the implementation of a triple loop, two loops
over the defaulters and one loop over the survivors. This probability is then estimated as
the fraction of all triples where the first defaulter has a higher rating score than the
survivor under Rating 1 and the second defaulter has a higher score than the
survivor under Rating 2. A more efficient procedure is given by the formulas

$$
\begin{aligned}
\hat{P}(S^1_{D,1} > S^1_{ND}, S^2_{D,2} > S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \cdot (1 - \hat{C}_D(i;1)) \cdot (1 - \hat{C}_D(j;2)), \\
\hat{P}(S^1_{D,1} < S^1_{ND}, S^2_{D,2} < S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \cdot \hat{C}_D(i-1;1) \cdot \hat{C}_D(j-1;2), \\
\hat{P}(S^1_{D,1} > S^1_{ND}, S^2_{D,2} < S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \cdot (1 - \hat{C}_D(i;1)) \cdot \hat{C}_D(j-1;2), \\
\hat{P}(S^1_{D,1} < S^1_{ND}, S^2_{D,2} > S^2_{ND}) &= \sum_{i=1}^{5} \sum_{j=1}^{5} \hat{S}^{12}_{ND}(i,j) \cdot \hat{C}_D(i-1;1) \cdot (1 - \hat{C}_D(j;2)).
\end{aligned}
$$


An application of these formulas to the data of Table 13.1 leads to

$$
\begin{aligned}
\hat{P}(S^1_{D,1} > S^1_{ND}, S^2_{D,2} > S^2_{ND}) &= 0.0389, & \hat{P}(S^1_{D,1} < S^1_{ND}, S^2_{D,2} < S^2_{ND}) &= 0.4949, \\
\hat{P}(S^1_{D,1} > S^1_{ND}, S^2_{D,2} < S^2_{ND}) &= 0.0620, & \hat{P}(S^1_{D,1} < S^1_{ND}, S^2_{D,2} > S^2_{ND}) &= 0.0790, \\
\hat{P}^{12}_{D,D,ND} &= 0.3930.
\end{aligned}
$$

A similar calculation for $P^{12}_{ND,ND,D}$ leads to $\hat{P}^{12}_{ND,ND,D} = 0.3534$. Taking
everything together and evaluating (13.21) leads to T = 0.57704. This corresponds
to a p-value of 0.4475. This means that the difference in the AUROC values of
Rating 1 and Rating 2 is not statistically significant. This result is not surprising
given the low number of debtors.
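The last step, from the test statistic to the p-value, can be reproduced with a small Python sketch. It uses the fact that for a $\chi^2$ distribution with one degree of freedom the upper tail probability equals the complementary error function evaluated at $\sqrt{T/2}$. The function names are hypothetical.

```python
from math import erfc, sqrt


def delong_statistic(u1, u2, var1, var2, cov12):
    """Test statistic T of (13.21) from two AUROCs, their variances and covariance."""
    return (u1 - u2) ** 2 / (var1 + var2 - 2.0 * cov12)


def chi2_one_dof_p_value(t):
    """Upper tail probability of a chi-square distribution with one degree of freedom."""
    return erfc(sqrt(t / 2.0))


# T as reported in the text; the resulting p-value rounds to 0.4475.
p_value = chi2_one_dof_p_value(0.57704)
```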


13.4 Correct Interpretation of AUROC 


In this section we want to give some guidelines on how to interpret AUROC
values.⁷ When discussing rating systems, one is often confronted with the opinion
that a good rating system should have some minimum value for the AUROC.
Sometimes people are happy that their rating system has a higher AUROC than
the rating model of others or a company wants to achieve an AUROC of x% during
the next 5 years for its rating systems.

⁷ See also Blochwitz et al. (2005).

In this section we explain why all these opinions and goals are unreasonable.
Consider a hypothetical portfolio with identical debtors only, e.g., a portfolio of
companies with identical balance sheets. No rating model has a chance to discriminate
anything in this situation because there is nothing to discriminate. This means
that the AUROC does not depend on the rating model only, but also on the portfolio.
This can be proven formally. Hamerle et al. (2003) show that for a portfolio of N
debtors the expected AUROC is given by

$$
E(AUROC) = \frac{0.5}{1 - PD_P} \cdot \left( \frac{2}{N^2 \cdot PD_P} \cdot (1 \cdot PD_1 + 2 \cdot PD_2 + \cdots + N \cdot PD_N) - PD_P - \frac{1}{N} \right), \tag{13.22}
$$


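where debtor i has a default probability $PD_i$ and the average default probability of
the portfolio is denoted by $PD_P$. Furthermore, it is assumed that the debtors are
sorted by credit quality such that $PD_1 \le PD_2 \le \dots \le PD_N$; with this ordering,
(13.22) reproduces the expected AUROC values reported for Table 13.5.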

A further example is provided. Consider two rating systems with two rating
categories for different portfolios. They are given in Table 13.5.


Table 13.5 Two rating systems on different portfolios

                                    Rating category
                                    1         2
Rating A    Number of debtors       500       500
            Default probability     1%        5%
Rating B    Number of debtors       500       500
            Default probability     1%        20%


We assume that both rating models are perfect, i.e., they assign the correct
default probability to each debtor. Then we find for the expected AUROC values

$$
E[AUROC_A] = 0.6718, \qquad E[AUROC_B] = 0.7527.
$$


We see that there is a huge difference in the AUROC values in spite of the fact 
that both ratings are perfect. This demonstrates that a comparison of AUROC 
values for different portfolios is meaningless. 
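The two expected AUROC values can be reproduced with a few lines of Python. The sketch assumes that (13.22) reads as 0.5/(1 - PD_P) times the bracket [2/(N² · PD_P) · (1·PD_1 + ... + N·PD_N) - PD_P - 1/N] and that the debtors are indexed in ascending order of their default probability, which is the ordering under which the reported values are obtained. The function name is hypothetical.

```python
def expected_auroc(pds):
    """Expected AUROC of a perfect rating, following the assumed reading of (13.22)."""
    ordered = sorted(pds)  # ascending default probabilities (assumed ordering)
    n = len(ordered)
    pd_portfolio = sum(ordered) / n
    weighted_sum = sum(i * pd for i, pd in enumerate(ordered, start=1))
    bracket = 2.0 * weighted_sum / (n ** 2 * pd_portfolio) - pd_portfolio - 1.0 / n
    return 0.5 / (1.0 - pd_portfolio) * bracket


# Portfolios of Table 13.5: 500 debtors in each of the two rating categories.
rating_a = [0.01] * 500 + [0.05] * 500
rating_b = [0.01] * 500 + [0.20] * 500

auroc_a = expected_auroc(rating_a)  # rounds to 0.6718
auroc_b = expected_auroc(rating_b)  # rounds to 0.7527
```

Both rating systems assign the correct default probability to every debtor, so the difference between the two values stems from the portfolio composition alone.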

The same applies to a comparison of the AUROC on the same portfolio at
different points in time. Because of changes in the portfolio structure over time, i.e.,
changes in the default probabilities of the debtors, the rating model is being
compared on different portfolios. However, this analysis could be helpful in spite
of this. If the AUROC of a rating model worsens over time, one should find out if
this is due to changes in the portfolio structure or if the quality of the rating model
has indeed deteriorated and a new estimation is necessary.

We conclude that a comparison of the AUROC of two rating models is 
meaningful only if it is carried out on the same portfolio at the same time. It 
does not make sense to compare AUROCs over different portfolios or to try 
to achieve a target AUROC. As demonstrated in the example in Table 13.5, 
achieving a higher AUROC could require the inclusion of more poor debtors 
into the portfolio, a business strategy not every credit institution might want to
follow.


Appendix A. Proof of (13.2)


We introduce the shortcut notation $\hat{C}^i_D = \hat{C}_D(R_i)$; $\hat{C}^i_{ND}$ and $\hat{C}^i_T$ have
a similar meaning. Furthermore, we denote the sample default probability by $\hat{p}$. Note
that $\hat{C}^i_T$ can be written in terms of $\hat{C}^i_D$ and $\hat{C}^i_{ND}$ as

$$
\hat{C}^i_T = \hat{p} \cdot \hat{C}^i_D + (1 - \hat{p}) \cdot \hat{C}^i_{ND}. \tag{13.23}
$$


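By computing simple integrals, we find for AUROC, $a_R + 0.5$, and $a_P$ the
expressions

$$
\begin{aligned}
AUROC &= \sum_{i=1}^{k} 0.5 \cdot (\hat{C}^i_D + \hat{C}^{i-1}_D) \cdot (\hat{C}^i_{ND} - \hat{C}^{i-1}_{ND}), \\
a_R + 0.5 &= \sum_{i=1}^{k} 0.5 \cdot (\hat{C}^i_D + \hat{C}^{i-1}_D) \cdot (\hat{C}^i_T - \hat{C}^{i-1}_T), \\
a_P &= 0.5 \cdot (1 - \hat{p}).
\end{aligned}
\tag{13.24}
$$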
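Plugging (13.23) into the expression for $a_R + 0.5$ and simplifying leads to

$$
\begin{aligned}
a_R + 0.5 &= \sum_{i=1}^{k} 0.5 \cdot (\hat{C}^i_D + \hat{C}^{i-1}_D) \cdot (\hat{C}^i_T - \hat{C}^{i-1}_T) \\
&= \sum_{i=1}^{k} 0.5 \cdot (\hat{C}^i_D + \hat{C}^{i-1}_D) \cdot \left( \hat{p} \cdot (\hat{C}^i_D - \hat{C}^{i-1}_D) + (1 - \hat{p}) \cdot (\hat{C}^i_{ND} - \hat{C}^{i-1}_{ND}) \right) \\
&= (1 - \hat{p}) \cdot \sum_{i=1}^{k} 0.5 \cdot (\hat{C}^i_D + \hat{C}^{i-1}_D) \cdot (\hat{C}^i_{ND} - \hat{C}^{i-1}_{ND}) \\
&\quad + \hat{p} \cdot \sum_{i=1}^{k} 0.5 \cdot (\hat{C}^i_D + \hat{C}^{i-1}_D) \cdot (\hat{C}^i_D - \hat{C}^{i-1}_D) \\
&= (1 - \hat{p}) \cdot AUROC + 0.5 \cdot \hat{p} \cdot \sum_{i=1}^{k} \left( (\hat{C}^i_D)^2 - (\hat{C}^{i-1}_D)^2 \right) \\
&= (1 - \hat{p}) \cdot AUROC + 0.5 \cdot \hat{p}.
\end{aligned}
\tag{13.25}
$$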
Taking (13.24) and (13.25) together leads to the desired result

$$
AR = \frac{a_R}{a_P} = \frac{(1 - \hat{p}) \cdot (AUROC - 0.5)}{0.5 \cdot (1 - \hat{p})} = 2 \cdot AUROC - 1.
$$


Appendix B. Proof of (13.7)


Using the same shortcut notation as in Appendix A, we get

$$
\begin{aligned}
AUROC &= \sum_{i=1}^{k} 0.5 \cdot \left( P(S_D \le R_i) + P(S_D \le R_{i-1}) \right) \cdot P(S_{ND} = R_i) \\
&= \sum_{i=1}^{k} \left( P(S_D \le R_{i-1}) + 0.5 \cdot P(S_D = R_i) \right) \cdot P(S_{ND} = R_i) \\
&= \sum_{i=1}^{k} P(S_D \le R_{i-1}) \cdot P(S_{ND} = R_i) + 0.5 \cdot \sum_{i=1}^{k} P(S_D = R_i) \cdot P(S_{ND} = R_i) \\
&= P(S_D < S_{ND}) + 0.5 \cdot P(S_D = S_{ND}),
\end{aligned}
$$

which proves (13.7).


Chapter 14
Statistical Approaches to PD Validation 


Stefan Blochwitz, Marcus R.W. Martin, and Carsten S. Wehn 


14.1 Introduction 


When developing an internal rating system, besides its calibration, the validation of 
the respective rating categories and associated probabilities of default plays an 
important role. To have a valid risk estimate and allocate economic capital effi- 
ciently, a credit institution has to be sure of the adequacy of its risk measurement 
methods and of the estimates for the default probabilities. Additionally, the valida- 
tion of rating grades is a regulatory requirement to become an internal ratings based 
approach bank (IRBA bank). 

We discuss different methods of validating estimates for probabilities of defaults 
(PDs). We start by outlining various concepts used to estimate PDs and the assump- 
tions in rating philosophies including point-in-time and through-the-cycle appro- 
aches, a distinction necessary for a proper validation. Having discussed this, several 
common statistical tests used for the validation of PD estimates are introduced. These
tests include the binomial test, the normal test and goodness-of-fit-type tests like the
$\chi^2$-test. Also, the incorporation of descriptive measures linked to density forecast
methods is discussed. For every test, the question of respective quality is raised.
An alternative approach starts with the one factor model and gives an intuitive
validation tool, the so-called extended traffic light approach. We conclude with
a discussion of the approaches introduced, especially with respect to possible
limitations for the use in practice and to their respective usefulness.


14.2 PDs, Default Rates, and Rating Philosophy 


The meaning of “Validation of PDs” or backtesting in credit risk can be described 
quite simply. In the words of the Basel Committee on Banking Supervision, it is to 
“compare realized default rates with estimated PDs for each grade of a rating 
system and to assess the deviation between the observed default rates and the 
estimated PD,” (cf. Basel Committee on Banking Supervision (2004), § 501). 
Here, backtesting is defined as a statistical task which hopefully can be solved 
with the existing means and tools. However, performing such backtesting in 
practice raises some issues. Before we discuss the statistical means we want to 
draw readers’ attention to some more general aspects: 


• Recognition of defaults: The validation of PDs depends fundamentally on the
recognition of defaults. A correct count of defaults is a necessary prerequisite for a
correctly determined default rate, and the measurement of default events is the underlying
concept of risk for determining PDs. A default of a borrower, however, is not an
objective event. On the one hand, there is the fact that a reasonable number of
defaulted borrowers seem to have a considerable influence on the timing of the
credit default. On the other hand, there is the observation that declaring a
borrower as defaulted leaves room for judgement. Therefore, the definition of
credit default is, to a considerable degree, subjective, and even the new Basel
framework retains this subjective element as the basis of the IRBA. However, a
forward-looking focus and a limit of 90 days past due, which is objective, are
implemented in the definition of default (cf. Basel Committee on Banking
Supervision (2004), §§ 452 and 453). The requirement is that the definition of
default, with all its subjective elements, has to be applied consistently to
guarantee that the conclusions drawn from the validation of PDs are correct.

• Inferring from default rates to PDs: A common and widespread approach for
credit risk is to apply the law of large numbers and to infer the probability of
default from the observed default rate. An application of the law of
large numbers would require that the defaults are independent and identically
distributed. This requirement cannot be seen to be fulfilled for different
borrowers. To illustrate: the difference is like that between approximating the
probability of throwing a six either by throwing the same die 1,000 times and
calculating the ratio of sixes to the total number of throws, or by throwing 1,000 dice
once and calculating the ratio of sixes to the number of dice thrown. In any case, a
proper application requires that borrowers are grouped into grades exhibiting similar
default risk characteristics. Thus, the validation of PDs in most cases is preceded by
grouping the borrowers into grades with the same risk profile (for an exemplary
exception, the Spiegelhalter statistics, cf. Chap. 15). This is necessary even in the
case of direct estimates of PD, when each borrower is assigned an individual PD.

• PDs in their context: An immediate consequence of the issues raised is that PDs
have a meaning just in a certain context, namely in the portfolio. In our opinion, 
there is no such thing as an objective PD which can be measured with a rating 
system like temperature can be measured with a thermometer. Let us assume we 
rate the same borrower with two different rating systems: One with good 
discriminatory power resulting in different grades, which are assumed to be 
calibrated perfectly, and the other — very simple system — assigning all borrowers 
to the same grade, calibrated with the portfolio PD. Applying these two rating 
systems to the same borrower would result in different PDs; either in the PD of 
the respective grade or in the portfolio PD. However, both systems can claim to 
be right and there is no method of deciding what the “true” PD of that borrower 
might be. The example works exactly the same for two rating systems with 
similar discriminatory power and the same numbers of grades, providing both 
systems are calibrated with two different portfolios. Let us assume there is a 
subset of borrowers, which appears in both portfolios. If the remainder of the 
respective portfolios is different in terms of risk, then the same borrower in 
general will be assigned to grades with different PDs, and again, both systems 
can claim to be right. 

• Rating philosophy: Rating philosophy is what is commonly referred to as either
point-in-time (PIT) or through-the-cycle (TTC) ratings. PIT-ratings measure 
credit risk given the current state of a borrower in its current economic envi- 
ronment, whereas TTC-ratings measure credit risk taking into account the 
(assumed) state of the borrower over a whole economic cycle. PIT and TTC 
mark the ends of the spectrum of possible rating systems. In practice, neither 
pure TTC nor pure PIT systems will be found, but hybrid systems, which are 
rather PIT or rather TTC. Agency ratings are assumed to be TTC, whereas bank 
internal systems — at least in most cases in Germany — are looked at as PIT. The 
underlying rating philosophy definitely has to be assessed before validation 
results can be judged, because the rating philosophy is an important driver for 
the expected range for the deviation between PDs and default rates. Jafry and 
Schuermann (2004) have introduced the equivalent average migration as a tool 
for assessing rating philosophy. According to Jafry and Schuermann (2004), the 
rescaled Euclidean—distance mobility metric is equal to the average migration, 
which describes the average number of borrowers migrating from one rating 
grade to another grade. This average migration gives an impression at which end
of the spectrum a rating system can be found: if it is 0, the rating system has
no migration at all (a TTC system in its purest form); if it is 1, then on average
no borrower stays in a rating grade. To level off credit risk measurement for PIT
systems as well as for TTC systems, the Basel Committee has clarified that 
estimation of PDs for regulatory purposes needs to include a forward looking 
element (cf. Principle 1 of Newsletter No. 4, Basel Committee on Banking 
Supervision 2005a). In practice, this would mean that for regulatory purposes in 
respect of risk quantification of their grades, PIT and TTC systems are a bit closer.


14.3 Tools for Validating PDs


This section is devoted to a brief overview on statistical tests that can be performed 
to validate the so-called calibration of a rating system, i.e. the assignment of a 
probability of default (PD) to a certain rating grade or score value. 

In most cases, the numbers of obligors or defaults are too small to obtain reliable
statistical implications, so a purely statistical validation of a rating system is not
sufficient to ensure the validity of the rating system and to draw the right conclusions.
It has to be complemented by alternative qualitative approaches such as shadow
rating systems or plausibility checks by credit experts (cf. OeNB/FMA (2004,
pp. 94), or Basel Committee on Banking Supervision 2004 and 2005b).

Furthermore, we implicitly assume that the validation of the rating system’s 
discriminatory power and stability is to be also checked by a validation procedure 
which should be part of an integrated process covering calibration, discriminatory 
power and stability of the rating system (cf. Blochwitz and Hohl (2007), Tasche 
(2005, pp. 32), or OeNB/FMA (2004) which also includes some numerical exam- 
ples). For various techniques for calibrating rating systems we refer to Dohler 
(2010) and van der Burgt (2007). 

We describe the rating system to be validated as follows: Let N denote the total
number of borrowers classified within a portfolio by application of the rating
system. Moreover, $N_k$ denotes the number of obligors in this portfolio which were
associated to the rating grade $k \in \{1, \dots, K\}$. Hence, we have

$$
N = \sum_{k=1}^{K} N_k.
$$

Finally, let each rating grade be assigned a probability of default forecast $PD_k$.

The statistical tests presented in this section can be classified rather approxi- 
mately either by validation period (single- versus multi-period tests) or by the 
number of rating grades undergoing the test (single- versus multi-grade tests). By 
construction, TTC rating systems are based on much longer time horizons than PIT 
rating systems. Therefore, the validation methodologies set out in this section will, 
in practice, be more applicable to PIT rather than to TTC rating systems. 


14.3.1 Statistical Tests for a Single Time Period 


We start by considering tests that are usually applied to a single time period case, 
i.e. starting about one and a half years after the first introduction of a rating 
system and in the annual validation process that follows. 

The most prominent example for this kind of test is the binomial test (as well as
its normal approximation) which is the most often applied single-grade single-period
test in practice. On the other hand, the Hosmer-Lemeshow- or $\chi^2$-test
provides an example of a single-period multi-grade test that can be used to check
the adequacy of PD forecasts for several rating grades simultaneously.


14.3.1.1 Binomial Test 


To apply the binomial test, we consider one single rating grade over a single time
period, usually 1 year. Therefore, we fix a certain rating grade by

(B.1) choosing a fixed rating grade $k \in \{1, \dots, K\}$ throughout this subsection,
and, additionally,

(B.2) assume independence of default events between all credits within the
chosen rating grade k.

The last assumption readily implies that the number of defaults in rating grade
$k \in \{1, \dots, K\}$ can be modelled as a binomially distributed random variable X with
size parameter $n := N_k$ and “success” probability $p := PD_k$. Thus, we can assess the
correctness of the PD forecast for one time period by testing the null hypothesis

H0: The estimated PD of the rating category is conservative enough, i.e. the
actual default rate is less than or equal to the forecasted default rate given by the PD

against the alternative hypothesis

H1: The estimated PD of the rating category is less than the actual default rate.

Thereby, the null hypothesis H0 is rejected at a confidence level $\alpha$ whenever
the number of observed defaults d in this rating grade is greater than or equal to the
critical value

$$
d_\alpha = \min\left\{ d : \sum_{j=d}^{N_k} \binom{N_k}{j} \cdot PD_k^{\,j} \cdot (1 - PD_k)^{N_k - j} \le 1 - \alpha \right\}.
$$

According to Tasche (2005), the binomial test is the most powerful test among
all tests at a fixed level. However, the true type I error (i.e. the probability of
erroneously rejecting the hypothesis of an adequate PD forecast) can be much larger
than the nominal level of the test if default events are correlated.
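A minimal Python sketch of the critical value is given below. It assumes that the critical value is the smallest d for which the probability of observing d or more defaults under the forecast PD does not exceed $1 - \alpha$, and it requires 0 < PD < 1. The function names are hypothetical. The probabilities are evaluated in log space so that large grades do not cause overflow.

```python
from math import exp, lgamma, log, log1p


def binomial_pmf(j, n, p):
    """P(X = j) for X ~ Binomial(n, p), evaluated in log space (0 < p < 1)."""
    log_coeff = lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
    return exp(log_coeff + j * log(p) + (n - j) * log1p(-p))


def binomial_critical_value(n_k, pd_k, alpha):
    """Smallest d with P(X >= d) <= 1 - alpha.

    Returns n_k + 1 if even d = n_k does not satisfy the condition,
    i.e. the null hypothesis can never be rejected.
    """
    tail = 0.0
    critical = n_k + 1
    for d in range(n_k, -1, -1):  # accumulate the upper tail from the top
        tail += binomial_pmf(d, n_k, pd_k)
        if tail > 1.0 - alpha:
            break
        critical = d
    return critical


# Hypothetical grade: 500 obligors with a forecast PD of 1%.
d_crit = binomial_critical_value(n_k=500, pd_k=0.01, alpha=0.99)
```

The null hypothesis would be rejected for this grade if the observed number of defaults is at least `d_crit`.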

In fact, assumption (B.2) is not realistic and disagrees with empirical experience:
in practice, default correlations in a range between 0 and 3% do occur. The Basel II
framework assumes asset correlations between 12 and 24%. Nevertheless, two recent
results deserve particular mention: for well diversified German retail portfolios,
indications exist that asset correlations are in a range between 0 and 5%, which in turn
would imply that default correlations are even smaller fractions of these (cf. Hamerle
et al. (2004) and Huschens and Stahl 2005).

Therefore, one gets a realistic early warning tool using the binomial test and its
rather complicated expression for the critical number of defaults. Another aspect
worth considering is that one should rely on consistency between the modelling
of correlation for risk measurement within the internally applied credit portfolio
model on the one hand and the validation on the other to derive consistent
confidence intervals.


14.3.1.2 Normal Approximation to the Binomial Test


One possibility of obtaining an easier (but only approximate) expression for the
number of critical defaults within a fixed rating grade $k \in \{1, \dots, K\}$ is to apply
the central limit theorem: In short, we take advantage of the limiting properties
of the binomial distribution and assume it approaches a normal distribution
in the limit as the number of obligors $N_k$ becomes large (enough). Hence,
we obtain

$$
d_\alpha = N_k \cdot PD_k + \Phi^{-1}(\alpha) \cdot \sqrt{N_k \cdot PD_k \cdot (1 - PD_k)}
$$

as a critical value where $\Phi^{-1}(\cdot)$ denotes the inverse of the cumulative standard
normal distribution function.

To apply this asymptotic approximation by the normal distribution, we necessarily
have to ensure that the condition (sometimes also called Laplace's rule of
thumb)

$$
N_k \cdot PD_k \cdot (1 - PD_k) > 9
$$

holds. In most cases of practical importance, the approximation seems to be valid
already for not too large numbers of $N_k$ (while some numerical examples indicate
that even for figures of $N_k$ as low as 50, the approximation works reasonably
well). Note that for low default probabilities and a low number of credits in the
individual rating classes, these prerequisites for using the normal approximation
imply implausibly high numbers of obligors.
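The approximate critical value and the rule of thumb translate directly into Python. The sketch below uses `statistics.NormalDist` from the standard library for the inverse normal distribution function. The function names and the example grade are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def normal_critical_value(n_k, pd_k, alpha):
    """Approximate critical number of defaults from the normal approximation."""
    quantile = NormalDist().inv_cdf(alpha)
    return n_k * pd_k + quantile * sqrt(n_k * pd_k * (1.0 - pd_k))


def laplace_rule_holds(n_k, pd_k):
    """Rule of thumb N_k * PD_k * (1 - PD_k) > 9 for using the approximation."""
    return n_k * pd_k * (1.0 - pd_k) > 9.0


# Hypothetical grade: 1,000 obligors with a forecast PD of 10%.
if laplace_rule_holds(1000, 0.10):
    d_crit = normal_critical_value(1000, 0.10, alpha=0.99)
```

For a grade with 100 obligors and a PD of 1% the rule of thumb fails, since 100 · 0.01 · 0.99 = 0.99, which illustrates the remark on low default probabilities above.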

The same approach as the one used to derive the normal approximation to the 
binomial test was applied by Stein (2003) to get a lower estimate for the number of 
defaults necessary for validating the accuracy of the PD forecasts. Stein (2003) also 
discusses the question of sample size [closely related to the finite population 
correction by Cochran (1977)] as well as the influence of correlated defaults 
which we address in the following subsection, too. 


14.3.1.3 A Modified Binomial Test Accounting for Correlated Defaults 


The assumption of uncorrelated defaults (B.2) for the binomial test generally yields 
an overestimate of the significance of deviations in the realized default rate from the 
forecast rate. In particular, this is true for risk underestimates, i.e. cases in which the 
realized default rate is higher than the forecasted rate. Therefore, from a purely 
conservative risk assessment point of view, overestimating significance is not 
critical in the case of risk underestimates. This means that it is entirely possible
to operate under the assumption of uncorrelated defaults. Clearly, persistent
overestimates of significance will lead to more frequent recalibration of the rating
model. In addition, this can have negative effects on the model's stability over
time. It is therefore necessary to determine at least the approximate extent to which
default correlations influence PD estimates.

Similar to the one-factor approach underlying the risk-weight functions of 
the IRB approach of Basel II, default correlations can be modelled on the basis 
of the dependence of default events on common (systematic) and individual 
(specific or idiosyncratic) random factors (cf. Tasche 2003 and 2005). For 
correlated defaults, this model also enables us to derive limits for assessing 
deviations in the realized default rate from its forecast as significant at certain 
confidence levels. 

On a confidence level $\alpha$, the null hypothesis H0 is rejected under the assumptions
(B.1) and (B.2) whenever the number of observed defaults d in rating grade
$k \in \{1, \dots, K\}$ is greater than or equal to the critical value
page q(1—4) (1 -2p)® ' (1-4) - /p®~"(PDx) 
* 2N; ” VPO (1-2) +07 (PD,) 2N;-./p(1—p) 
p(1-p) 
where 


and p denotes the default correlation. This adjustment takes into account that due 
to unsystematic risk correlation with the systematic risk factor, the respective quan- 
tile lies a little further in the tail than without this further uncertainty and thus needs 
to be corrected. 

Tasche (2005) shows that assumption (B.2) is not robust for higher percentiles,
i.e., small deviations from a zero correlation already lead to dramatic changes in
the critical value of the test, which is of course not a desirable feature of a test.
Furthermore, Tasche (2005) concludes that taking into account dependence by
incorporating a one factor dependence structure generated by a Vasicek dynamic
and Gordy's granularity adjustment yields tests of rather moderate power. This is
the case even for such low correlation levels as are typical for the problem of
correlated defaults.

Clearly, the normal approximation is also applicable in this context and yields an 
easier expression for the critical number of defaults. 

Up to now, only single rating grades k were validated separately. The next test
by Hosmer and Lemeshow will close this gap by an approach to validating more
than a single rating grade simultaneously.


14.3.1.4 Goodness-of-Fit Type Tests: $\chi^2$- or Hosmer-Lemeshow-test


The two-sided Hosmer-Lemeshow-test provides an alternative approach in a single-period
validation environment to check the adequacy of PD forecasts for several
rating grades simultaneously. Recall that $PD_k$ denotes the PD forecast for rating
grade $k \in \{1, \dots, K\}$.

For this purpose, let us pose the following assumptions:

(LH.1) The forecasted default probabilities $PD_k$ and the default rates $p_k := d_k / N_k$
are identically distributed.

(LH.2) All the default events within each of the different rating grades as well as
between all rating grades are independent.

Let us define the statistic

$$
S_K = \sum_{k=1}^{K} \frac{(N_k \cdot PD_k - d_k)^2}{N_k \cdot PD_k \cdot (1 - PD_k)}
$$

with $d_k = p_k \cdot N_k$ denoting the number of defaulted obligors with rating
$k \in \{1, \dots, K\}$. By the central limit theorem, when $N_k \to \infty$ simultaneously for
all $k \in \{1, \dots, K\}$, the distribution of $S_K$ will converge towards a
$\chi^2$-distribution with K degrees of freedom because of assumptions (LH.1) and (LH.2).

Again, a limiting distribution is used to assess the adequacy of the PD forecasts
of the rating system by considering the p-value of a $\chi^2$-test: The closer the p-value
is to zero, the worse the estimation is. A further problem arises when the $PD_k$
are very small: In this case the rate of convergence to the $\chi^2$-distribution may be
very low as well. Furthermore, relying on the p-value enables under certain
circumstances (e.g. comparability of underlying portfolios) a direct comparison of
forecasts with different numbers of rating categories.

The construction of the test is based on the assumption of independence and a
normal approximation again. Therefore, the Hosmer-Lemeshow-test is also likely
to underestimate the true type I error (as the binomial test).
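The statistic and its asymptotic p-value can be sketched in Python using only the standard library. The sketch assumes the statistic is the sum over all grades of the squared deviation between expected and observed defaults, divided by $N_k \cdot PD_k \cdot (1 - PD_k)$, with K degrees of freedom for the limiting distribution. The tail probability of the $\chi^2$ distribution is obtained from the standard recursion for the regularized incomplete gamma function. The function names and the grade data are hypothetical.

```python
from math import erfc, exp, lgamma, log, sqrt


def hosmer_lemeshow_statistic(n, pd, d):
    """Sum of squared standardized deviations over all rating grades."""
    return sum(
        (n_k * pd_k - d_k) ** 2 / (n_k * pd_k * (1.0 - pd_k))
        for n_k, pd_k, d_k in zip(n, pd, d)
    )


def chi2_sf(x, dof):
    """P(X > x) for a chi-square variable with an integer number of degrees of freedom."""
    if x <= 0.0:
        return 1.0
    y = x / 2.0
    if dof % 2 == 0:
        a, q = 1.0, exp(-y)          # closed form for two degrees of freedom
    else:
        a, q = 0.5, erfc(sqrt(y))    # closed form for one degree of freedom
    while a < dof / 2.0 - 1e-9:      # step up by two degrees of freedom at a time
        q += exp(a * log(y) - y - lgamma(a + 1.0))
        a += 1.0
    return q


# Hypothetical grades: obligors, forecast PDs and observed defaults.
N = [100, 100]
PD = [0.10, 0.20]
D = [13, 20]
statistic = hosmer_lemeshow_statistic(N, PD, D)
p_value = chi2_sf(statistic, dof=len(N))
```

A p-value close to zero would indicate a poor fit of the PD forecasts across the grades considered.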


14.3.1.5 Brier Score 


Another method to validate a rating system across all rating grades is to calculate
the average quadratic deviation of the forecasted PD and the realized default rates.
In contrast to the preceding statistical tests, this is an exploratory method.
The resulting score between zero and one is called Brier score (cf. Brier 1950) and
is defined in the context of N debtors associated to the K rating grades by

$$B = \frac{1}{N} \sum_{k=1}^{K} \left[ d_k (1 - PD_k)^2 + (N_k - d_k) PD_k^2 \right]
   = \frac{1}{N} \sum_{k=1}^{K} N_k \left[ p_k (1 - PD_k)^2 + (1 - p_k) PD_k^2 \right]$$

where $PD_k$ denotes the probability of default assigned to each obligor in rating
grade k and $p_k = d_k/N_k$ is the observed default rate within the same rating grade (cf.
OeNB/FMA 2004). The closer the Brier score is to zero, the better is the forecast of
default probabilities.

Note that, by definition, the Brier score does not measure directly the difference 
of the default probability forecast and the true conditional probability of default. 
Hence, the Brier score is in fact not a measure of calibration accuracy alone. Since
the Brier score can be decomposed as

$$B = p (1 - p) + \frac{1}{N} \sum_{k=1}^{K} N_k (PD_k - p_k)^2 - \frac{1}{N} \sum_{k=1}^{K} N_k (p - p_k)^2$$

(cf. Murphy and Winkler 1992) whereby $p = d/N$, a separate analysis is in principle
possible:


• The first term describes the variance of the default rate observed over the entire
sample. Here, p denotes the default frequency of the overall sample. This value
is independent of the rating procedure’s calibration and depends only on the
observed sample itself. It represents the minimum Brier score attainable for this
sample with a perfectly calibrated but also “trivial rating model”, which forecasts
the observed default rate precisely for each obligor, but only comprises one rating
class for the whole sample, i.e. $PD_k = p_k = p = d/N$ for all $k \in \{1, \ldots, K\}$.
In this case the expected Brier score is equal to the variance of the default indicator,
i.e. the first of the three terms in the representation above, $B = \bar{B} := p \cdot (1 - p)$.

• The second term represents the average quadratic deviation of forecast and
realized default rates in the K rating classes. A well-calibrated rating model
will show lower values for this term than a poorly calibrated rating model. The
value itself is thus also referred to as the “calibration”.

• The third term describes the average quadratic deviation of observed default
rates in individual rating classes, from the default rate observed in the overall
sample. This value is referred to as “resolution”. While the resolution of the
trivial rating model is zero, it is not equal to zero in discriminating rating
systems. In general, the resolution of a rating model rises as rating classes
with clearly differentiated observed default probabilities are added. Resolution
is thus linked to the discriminatory power of a rating model.


An additional caveat is the different signs preceding the calibration and resolu- 
tion terms. These make it more difficult to interpret the Brier score as an individual 
value for the purpose of assessing the classification accuracy of a rating model’s 
calibration. Moreover, the numerical values of the calibration and resolution terms 
are generally far lower than the total variance. 

One of the main drawbacks of the Brier score is its performance for small default
probabilities. In this case the “trivial rating model” yields a rather small Brier score.
By “trivial rating model” we mean that all debtors are assigned the realized default
rate p of the overall sample. In this case, the expected Brier score is equal to the
variance of the default indicator, i.e. the first of the three terms in the representation
above,

$$\bar{B} = p \cdot (1 - p).$$

Evidently, for $p \to 0$ the Brier score also converges to zero. The only possibility
of applying this score in a meaningful way is to compute the Brier score relative to
the “trivial score” $\bar{B}$ since the absolute values are very close together for cases with
few defaults.
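
The Brier score and its three-term decomposition can be checked numerically. The sketch below is only an illustration: the function names and the sample data are assumptions, and the formulas follow the grade-level definition given above (variance plus calibration minus resolution).

```python
def brier_score(pds, obligors, defaults):
    """Grade-level Brier score B for N = sum(obligors) debtors."""
    total = sum(obligors)
    return sum(
        d * (1.0 - pd) ** 2 + (n - d) * pd ** 2
        for pd, n, d in zip(pds, obligors, defaults)
    ) / total


def brier_decomposition(pds, obligors, defaults):
    """Return (variance, calibration, resolution) with
    B = variance + calibration - resolution."""
    total = sum(obligors)
    p = sum(defaults) / total                      # overall default rate
    variance = p * (1.0 - p)                       # "trivial score"
    calibration = sum(
        n * (pd - d / n) ** 2 for pd, n, d in zip(pds, obligors, defaults)
    ) / total
    resolution = sum(
        n * (p - d / n) ** 2 for n, d in zip(obligors, defaults)
    ) / total
    return variance, calibration, resolution


# Hypothetical portfolio with three rating grades.
pds = [0.005, 0.02, 0.08]
obligors = [400, 300, 100]
defaults = [3, 5, 9]

b = brier_score(pds, obligors, defaults)
variance, calibration, resolution = brier_decomposition(pds, obligors, defaults)
relative_score = b / variance   # Brier score relative to the trivial score
```

Reporting `relative_score` rather than `b` follows the remark above that absolute values are hard to interpret when defaults are rare.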


14.3.2 Statistical Multi-period Tests 


While the binomial test and the $\chi^2$-test are usually restricted to a single-period
validation framework, the normal test and the extended traffic lights approach are 
devoted to overcoming the assumption of independence inherent to most single-period 
tests by assuming a dependence structure throughout a time horizon of several years. 


14.3.2.1 Normal Test 


The normal test for a given rating grade k, is a multi-period test of correctness of a 
default probability forecast for a single rating grade. It can be applied under the 
assumption that the mean default rate does not vary too much over time and that 
default events in different years are independent. Mathematically speaking, the 
fundamental assumptions for the normal test are given by 

(N) The random variables $PD_{t,k} = D_{t,k}/N_{t,k}$ that describe the forecasted probabilities
of default for a single rating grade $k \in \{1, \ldots, K\}$ over the years $t \in \{1, \ldots, T\}$ are
independent with means $\mu_{t,k}$ and common variance $\sigma_k^2 > 0$.

In this case, the central limit theorem can be applied to prove that the
standardized sum $S_N^k$ with

$$S_N^k = \frac{\sum_{t=1}^{T} (PD_{t,k} - \mu_{t,k})}{\sigma_k \sqrt{T}}$$

will converge to the standard normal distribution as $T \to \infty$. Since the rate of
convergence is extremely high, even small values of T yield acceptable results.
Consequently, to apply the normal test to the PD forecasts $PD_{t,k}$ and corresponding
observed percentage default rates $\mu_{t,k}$, one has to estimate the variance $\sigma_k^2$. The
classical estimator


$$\hat{\sigma}_0^2 = \frac{1}{T - 1} \sum_{t=1}^{T} (\mu_{t,k} - PD_{t,k})^2$$

is unbiased only if the forecasted PDs exactly match the default rates $\mu_{t,k}$. Otherwise,
the classical estimator will be upwardly biased, hence one should
choose

$$\hat{\sigma}^2 = \frac{1}{T - 1} \left[ \sum_{t=1}^{T} (\mu_{t,k} - PD_{t,k})^2 - \frac{1}{T} \left( \sum_{t=1}^{T} (\mu_{t,k} - PD_{t,k}) \right)^2 \right]$$

instead. This alternative estimator $\hat{\sigma}^2$ is unbiased under the hypothesis of exact
forecasts, too, but less upwardly biased than the classical estimator otherwise.

Now, we can test the null hypothesis

HN: None of the realized default rates in the years $t \in \{1, \ldots, T\}$ is greater than
its corresponding forecast $PD_{t,k}$.

Therefore, the null hypothesis HN is rejected at a confidence level $\alpha$ whenever

$$S_N^k > z_\alpha$$

where $z_\alpha$ denotes the standard-normal $\alpha$-quantile.

Note that cross-sectional dependence is admissible in the normal test. Tasche 
(2003, 2005) points out that the quality of the normal approximation is moderate but 
exhibits a conservative bias. Consequently, the true type I error tends to be lower 
than the nominal level of the test. This means that the proportion of erroneous 
rejections of PD forecasts will be smaller than might be expected from the formal 
confidence level of the test. Furthermore, the normal test seems even to be, to a 
certain degree, robust against a violation of the assumption that defaults are 
independent over time. However, the power of the test is moderate, in particular 
for short time series (for example 5 years). 
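
A minimal sketch of the normal test for one rating grade is given below. It is an illustration under stated assumptions rather than a reference implementation: the function name `normal_test` and the sample series are hypothetical, the statistic is oriented as observed default rate minus forecast (so that large values speak against HN), and the variance is estimated with the alternative, less biased estimator.

```python
import math
from statistics import NormalDist


def normal_test(default_rates, forecasts, alpha=0.95):
    """Return (statistic, critical value, reject HN?) for one rating grade.

    default_rates and forecasts are yearly series of equal length T >= 2.
    """
    t = len(default_rates)
    diffs = [r - f for r, f in zip(default_rates, forecasts)]
    s1 = sum(diffs)
    s2 = sum(x * x for x in diffs)
    # Alternative estimator: centred sum of squares divided by T - 1.
    variance = (s2 - s1 * s1 / t) / (t - 1)
    if variance <= 0.0:
        raise ValueError("variance estimate is zero; test not applicable")
    statistic = s1 / (math.sqrt(variance) * math.sqrt(t))
    critical = NormalDist().inv_cdf(alpha)   # standard-normal alpha-quantile
    return statistic, critical, statistic > critical


# Hypothetical 5-year history for a grade with a constant forecast of 1%.
rates = [0.012, 0.015, 0.009, 0.014, 0.016]
forecasts = [0.01] * 5
statistic, critical, reject = normal_test(rates, forecasts)
```

With only five observations the power caveat mentioned above applies: a non-rejection should not be read as confirmation of the forecasts.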


14.3.2.2 Extended Traffic Light Approach 


Dating back to the approval of market risk models for regulatory purposes, the idea 
of using a traffic light approach for model validation seems to be a considerable 
exploratory extension to statistical tests. In the Basel Committee on Banking 
Supervision (1996), for value at risk outliers produced by market risk models, a 
binomial test with green, yellow and red zones is implemented that leads eventually 
to higher capital charges against potential risks. 

Tasche (2003) picks up the idea of a traffic light approach for the validation of 
default probabilities. The basic idea is to introduce probability levels $\alpha_{low} = 0.95$
and $\alpha_{high} = 0.999$ (note that the exemplary levels are similar to Basel Committee
on Banking Supervision 1996) with respective critical values $c_{low}$ and $c_{high}$,
which ensure, under the model used, that the ex post observed number of defaults
exceeds the level $c_{low}$ only with probability $1 - \alpha_{low}$ (and $c_{high}$ only with probability
$1 - \alpha_{high}$, respectively). First, the modified binomial test is introduced as above.
Furthermore, the Vasicek model with asset correlation $\rho$, independent standard
normal random variables $X$ and $\varepsilon_1, \ldots, \varepsilon_{N_k}$, and a threshold c is given by (see also
Martin et al. 2006)

$$d_k = \sum_{i=1}^{N_k} 1_{(-\infty, c]} \left( \sqrt{\rho}\, X + \sqrt{1 - \rho}\, \varepsilon_i \right).$$

Now, to determine critical values, the choice of asset correlation is of crucial
importance as the critical values are given for a level $\alpha$ by

$$c_{crit} = \min\{ i : P(d_k \ge i) \le 1 - \alpha \}.$$


Two approaches are introduced, one based on a granularity adjustment and one 
based on moment matching, see above. It can be concluded that, for high values of 
asset correlation, the respective critical values change clearly. 
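
To make the influence of asset correlation on the critical values concrete, the following sketch evaluates the critical-value definition numerically for the one-factor model above. It is not the granularity-adjustment or moment-matching approach mentioned in the text: the distribution of the number of defaults is obtained by brute-force integration over the systematic factor X on a fixed grid, which is an assumption of this example, as are the function names and the inequality convention $P(d_k \ge i) \le 1 - \alpha$.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def default_count_pmf(n, pd, rho, grid=241, span=6.0):
    """Approximate P(d = j), j = 0..n, in the one-factor model by
    integrating the conditional binomial law over the factor X."""
    c = _STD.inv_cdf(pd)                           # default threshold
    xs = [-span + 2.0 * span * i / (grid - 1) for i in range(grid)]
    ws = [_STD.pdf(x) for x in xs]
    total = sum(ws)                                # normalise grid weights
    log_binom = [
        math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        for j in range(n + 1)
    ]
    pmf = [0.0] * (n + 1)
    for x, w in zip(xs, ws):
        # Conditional PD given X = x.
        p = _STD.cdf((c - math.sqrt(rho) * x) / math.sqrt(1.0 - rho))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        lp, lq = math.log(p), math.log1p(-p)
        for j in range(n + 1):
            pmf[j] += (w / total) * math.exp(log_binom[j] + j * lp + (n - j) * lq)
    return pmf


def critical_value(n, pd, rho, alpha):
    """Smallest i with P(d >= i) <= 1 - alpha."""
    pmf = default_count_pmf(n, pd, rho)
    tail = 1.0                                     # P(d >= 0)
    for i in range(n + 1):
        if tail <= 1.0 - alpha:
            return i
        tail -= pmf[i]
    return n + 1


# Hypothetical grade: 200 obligors, PD of 1%.
c_independent = critical_value(200, 0.01, 0.0, 0.95)
c_correlated = critical_value(200, 0.01, 0.2, 0.95)
```

For $\rho = 0$ the model reduces to the binomial case; with positive correlation the critical value moves upwards, in line with the conclusion stated above.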

Blochwitz et al. (2005) propose an alternative approach for implementing a 
traffic light based judgment that does not need an explicit specification of asset 
correlations emphasizing the accessibility for practitioners. They use a heuristic 
approach to the validation of rating estimates and to identify suspicious credit 
portfolios or rating grades. 

Starting again with assumptions (B.1) and (B.2), the number of defaults can be
determined to be binomially distributed. Using the results given in Sect. 14.3.1.2,
they obtain

$$p_{max} = PD_k + \Phi^{-1}(\alpha_{bin}) \sqrt{\frac{PD_k (1 - PD_k)}{N_k}}$$

for some given level of confidence $\alpha_{bin}$. A similar consideration for the one-factor
model (cf. Vasicek (1987) among others) with asset correlation $\rho$ yields

$$p_{max} = \Phi\left( \frac{\sqrt{\rho}\, \Phi^{-1}(\alpha_{asset}) + \Phi^{-1}(PD_k)}{\sqrt{1 - \rho}} \right).$$

The next step is to compare the second order error for the statistics of these two
approaches. Using $\beta := 1 - \alpha$ with $\beta_{bin}$ and $\beta_{asset}$ as the values for the respective
models, they derive:

$$p_{max} = PD_k + \Phi^{-1}(1 - \beta_{bin}) \sqrt{\frac{PD_k (1 - PD_k)}{N_k}}
         = \Phi\left( \frac{\sqrt{\rho}\, \Phi^{-1}(1 - \beta_{asset}) + \Phi^{-1}(PD_k)}{\sqrt{1 - \rho}} \right).$$

A comparison shows that for low levels of asset correlation covering many 
relevant situations, there is no significant difference in the second order errors. 
Therefore, for good reason, the subsequent considerations can be based on the
normal approximation. 
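
The two expressions for the maximum default rate can be compared directly. The sketch below is illustrative only; the function names and parameter values are assumptions, and the formulas are the normal approximation of the binomial model and the one-factor (Vasicek) quantile as reconstructed above.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def p_max_binomial(pd, n, alpha):
    """Upper bound for the default rate under the normal approximation
    of the binomial model with n independent obligors."""
    return pd + _STD.inv_cdf(alpha) * math.sqrt(pd * (1.0 - pd) / n)


def p_max_vasicek(pd, rho, alpha):
    """Upper bound for the default rate in the one-factor model
    (infinitely granular portfolio) with asset correlation rho."""
    num = math.sqrt(rho) * _STD.inv_cdf(alpha) + _STD.inv_cdf(pd)
    return _STD.cdf(num / math.sqrt(1.0 - rho))


# Hypothetical grade: PD of 1%, 500 obligors, 95% level.
bound_binomial = p_max_binomial(0.01, 500, 0.95)
bound_low_rho = p_max_vasicek(0.01, 0.01, 0.95)
bound_high_rho = p_max_vasicek(0.01, 0.10, 0.95)
```

For a small asset correlation both bounds are of similar size in this example, whereas the Vasicek bound grows quickly as $\rho$ increases.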

To compare the adequacy of possibly time-varying forecasts for probabilities
of default, the application is based on a relative distance between observed default
rates and forecasted probabilities of default. Motivated by the considerations above
and by taking into account the expression

$$\sigma(PD_k, N_k) = \sqrt{PD_k (1 - PD_k) / N_k},$$

Blochwitz et al. (2005) establish four coloured zones to analyse the deviation of
forecasts and realisations by setting

Green if $p_k < PD_k$
Yellow if $PD_k \le p_k < PD_k + K^y \sigma(PD_k, N_k)$
Orange if $PD_k + K^y \sigma(PD_k, N_k) \le p_k < PD_k + K^o \sigma(PD_k, N_k)$
Red if $PD_k + K^o \sigma(PD_k, N_k) \le p_k$.

The parameters $K^y$ and $K^o$ have to be chosen carefully as they strongly influence
the results of the later application to a given data set. Practical considerations lead
to the conclusion that the respective probability for the colours green, yellow,
orange and red to appear should decline. But in contrast, $K^o$ should not be chosen
too large as in the tail of the distribution, asset correlation influences results much
more than in the centre of it. Hence, a proper choice could be $K^y = 0.84$ and
$K^o = 1.64$, which corresponds to a probability of observing green of 0.5, observing
yellow with 0.3, orange with 0.15 and red with 0.05.
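
The colour assignment can be sketched in a few lines. The function names are assumptions of this example; the thresholds $K^y = 0.84$ and $K^o = 1.64$ are the values suggested above, and the zone boundaries are treated as closed on the lower side.

```python
import math

K_YELLOW = 0.84   # K^y from the text
K_ORANGE = 1.64   # K^o from the text


def sigma(pd, n):
    """Standard deviation of the default rate under independence."""
    return math.sqrt(pd * (1.0 - pd) / n)


def traffic_light(default_rate, pd, n):
    """Map an observed default rate to one of the four colour zones."""
    s = sigma(pd, n)
    if default_rate < pd:
        return "green"
    if default_rate < pd + K_YELLOW * s:
        return "yellow"
    if default_rate < pd + K_ORANGE * s:
        return "orange"
    return "red"


# Hypothetical grade with PD = 1% and 500 obligors:
# the zone limits are roughly 1.00%, 1.37% and 1.73%.
colours = [traffic_light(r, 0.01, 500) for r in (0.008, 0.012, 0.016, 0.020)]
```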

Since more than one period can be included in the evaluation framework, a natural
enhancement is the application to a multi-period setting.
Now, a labelling function is given by

$$\Lambda[L_g, L_y, L_o, L_r] = 1000 L_g + 100 L_y + 10 L_o + L_r.$$

A possible weighting function is

$$\Omega[L_g, L_y, L_o, L_r] = P_g L_g + P_y L_y + P_o L_o + P_r L_r$$

with $L_g$ denoting the number of observed green periods, $L_y$ the number of yellow
periods and so on, and $P_g$, $P_y$, $P_o$ and $P_r$ the associated probabilities (i.e. 0.5, 0.3,
0.15, and 0.05 respectively).

With the help of the weighting function, it is possible to assign a mixed colour
for more than one observed period. By numerical analysis and by application to
rating agencies’ data, it is concluded that for many relevant cases, the derived
extended traffic light approach gives clear indications for a review of the forecasts
of probabilities of default.

According to Blochwitz et al. (2004), it is also possible to apply a multi-period
null hypothesis which is in fact a continuation of the null hypothesis of the
normal test (HN): Reject the hypothesis at a level $\beta$ if $\Lambda[L_g, L_y, L_o, L_r] \le c_\beta$,
where

$$c_\beta = \max\{ c \mid P(\Lambda[L_g, L_y, L_o, L_r] \le c) \le 1 - \beta \}.$$


Numerical studies to check the robustness with respect to the adequacy of 
neglecting correlations show that the extended traffic light approach is a useful 
tool in the jigsaw of validation. 
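
The labelling and weighting functions, together with a critical value for the multi-period hypothesis, can be sketched as follows. This is an illustration under explicit assumptions: periods are treated as independent with colour probabilities 0.5, 0.3, 0.15 and 0.05, the number of periods is limited to at most nine so that the decimal label is unique, and the inequality convention for the critical value is the one chosen in this example. All function names are hypothetical.

```python
from math import factorial

PROBS = (0.5, 0.3, 0.15, 0.05)   # green, yellow, orange, red


def label(lg, ly, lo, lr):
    """Labelling function: counts of green/yellow/orange/red periods."""
    return 1000 * lg + 100 * ly + 10 * lo + lr


def weight(lg, ly, lo, lr):
    """Weighting function using the colour probabilities."""
    return PROBS[0] * lg + PROBS[1] * ly + PROBS[2] * lo + PROBS[3] * lr


def label_distribution(periods):
    """Multinomial distribution of the label over `periods` independent years."""
    if not 1 <= periods <= 9:
        raise ValueError("the decimal label is only unique for 1..9 periods")
    dist = {}
    for lg in range(periods + 1):
        for ly in range(periods + 1 - lg):
            for lo in range(periods + 1 - lg - ly):
                lr = periods - lg - ly - lo
                coef = factorial(periods) // (
                    factorial(lg) * factorial(ly) * factorial(lo) * factorial(lr)
                )
                prob = coef * PROBS[0] ** lg * PROBS[1] ** ly \
                    * PROBS[2] ** lo * PROBS[3] ** lr
                dist[label(lg, ly, lo, lr)] = prob
    return dist


def critical_label(periods, beta):
    """Largest attainable label c with P(label <= c) <= 1 - beta, or None."""
    cumulative, critical = 0.0, None
    dist = label_distribution(periods)
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative > 1.0 - beta:
            break
        critical = value
    return critical


dist5 = label_distribution(5)
crit5 = critical_label(5, 0.95)
```

A rating grade whose observed label over five years does not exceed `crit5` would be flagged for review at the chosen level in this simplified setting.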


14.3.2.3 Some Further Readings and Remarks 


In Chap. 5 a PD estimation method applicable even for low default portfolios is 
suggested. The main idea is to use the most prudent estimation principle, i.e. to 
estimate the PD by upper confidence bounds while guaranteeing at the same 
time, a PD ordering that respects the differences in credit quality indicated by 
the rating grades. Unfortunately, the application of the proposed methodology 
for backtesting or similar validation tools would not add much additional 
information, as the (e.g. purely expert based) average PDs per rating grade 
would normally be well below the quantitative upper bounds proposed using 
the most prudent estimation principle. 

Other approaches to estimating non-zero PDs for high-quality rating grades 
are based upon Markov chain properties of rating migration matrices [cf.
Schuermann and Hanson (2004) or Jafry and Schuermann (2004)]. Therefore, 
a qualitative study of the evolution of these transition matrices across several 
years can shed light on possible problems in a rating system. After all, we still 
lack reliable statistical validation methods for low default portfolios or high- 
quality rating grades. 

For further discussions concerning backtesting issues, refer to Frerichs and 
Löffler (2003) or Bühler et al. (2002) and the references therein. 


14.3.3 Discussion and Conclusion 


All the above mentioned tests focus on comparisons between the forecasted
probabilities of default and the subsequently observed default rates. For all statistical tests,
any correlation (i.e. asset or default correlation) between different obligors
plays a crucial role and thus influences the possibilities for the use of the test in
practice. Some tests neglect correlation; for others, it is necessary to specify it. It is
commonly understood that the available data are not comprehensive enough to test
correlation itself. Hence, it is highly important to keep in mind the respective
assumptions used by the different tests.

Further work can be done on integrating different rating categories into one test 
and with respect to the ranking of statistical tests for their use in practice. In the 
validation process to be established in a bank, the use of the statistical tests and
exploratory means introduced herein can thus only be one piece of the puzzle 
among others. 


14.4 Practical Limitations to PD Validation 


For several reasons, backtesting techniques of PDs, as described here, have their 
limitations: 


• Precision of measurement: Calibrating a rating system is comparable to measuring
a physical property. If, as a rule of thumb in measurement theory, a
standard deviation is taken as a reasonable size of the measurement error, the
figures are rather disappointing. A lower bound for the measurement error of the
k-th rating grade is given by the standard deviation of the uncorrelated binomial
distribution. As a numerical example: Setting $N_k = 500$ and $PD_k = 1\%$ yields
$\sigma(PD_k, N_k) = 0.45\%$, resulting in a relative error of measurement of 45%, which
is an extraordinarily high error compared to measurements of physical properties. This
argument can also be turned around: If it is assumed that the PD has been estimated
precisely, then it would be no surprise to see default rates fluctuating by
a standard deviation around the PD.¹

• Limited data: Backtesting relies on data. All statistical methods discussed here
need a certain number of defaults to be observed before they can be applied. This 
challenge can be illustrated with a numerical example. For investment grade 
portfolios with PDs of less than 10 bps, a size of more than 1,000 borrowers is 
necessary to observe an average one default per year. These portfolios often are 
much smaller in size, and empirical evidence shows in most years no default at 
all. In these cases, backtesting would not provide any information, because 
neither evidence for a right calibration nor for an insufficient calibration can 
be found, because for PDs larger than zero, default rates of zero are observed. 
The implication of limited default data on the validation of rating systems and 
specifically on backtesting issues, are discussed in the Basel Committee on 
Banking Supervision (2005b). 

• Impact of stress: Rating systems are designed to work in “normal” times. In
general they are calibrated to a more or less conservatively estimated expected
value of the PD for a longer time horizon. However, from time to time,
unforeseeable events, often called “stress”, result in a sudden increase of
default rates, which may be interpreted as a result of a sudden and likewise
unforeseeable increase of PDs caused by that event. Usually, banks utilize credit
risk models and the correlations modelled therein, yielding measures like Credit
Value at Risk (CVaR). In the Basel framework, this is implemented in the risk
weight function, which can be looked at as a kind of stressed PD: The expected
value of the PD is “translated” by this function into a stressed PD, which is
expected to appear once in 1,000 years, see Basel Committee on Banking
Supervision (2005c). If PDs are estimated as expected values, then in periods
of stress, any validation technique of PDs that compares a calibrated long run
average PD to observed default rates will fail, because as a result of the stress to
which the rated borrowers are exposed, the default rates will exceed that type of
PD heavily.


¹Under the settings of the normal approximation of the binomial test in Sect. 14.3.1.2 there is a
more than 15% chance that the default rate exceeds the PD by more than a standard deviation.
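
The numerical examples used in the list above for the precision of measurement and for limited data can be reproduced with a few lines of Python. The variable names are assumptions of this sketch, and the probability of observing no default at all is computed under the additional assumption of independent defaults, which is not stated in the text.

```python
import math
from statistics import NormalDist

# Precision of measurement: N_k = 500 obligors, PD_k = 1%.
pd_k, n_k = 0.01, 500
sigma = math.sqrt(pd_k * (1.0 - pd_k) / n_k)     # about 0.45%
relative_error = sigma / pd_k                    # about 45%

# Chance that the default rate exceeds the PD by more than one standard
# deviation under the normal approximation (more than 15%).
prob_exceed_one_sigma = 1.0 - NormalDist().cdf(1.0)

# Limited data: PD of 10 bps and 1,000 borrowers.
pd_ig, n_ig = 0.001, 1000
expected_defaults = pd_ig * n_ig                 # one default per year
prob_no_default = (1.0 - pd_ig) ** n_ig          # assumes independence
```

Even with 1,000 investment grade borrowers, a year without any default remains a likely outcome under these assumptions, which illustrates why backtesting such portfolios provides little information.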


Further, when rating systems are backtested, two aspects need to be balanced:
(1) One period tests make a statement about the current performance of a rating
system’s calibration. However, this statement must be judged carefully, because it
may be misleading for reasons already mentioned. (2) Multi period tests as suggested
in this article provide a more robust statement about a rating system’s performance,
but these tests have another drawback: They need a time series of 4 years at minimum.
In 4 years’ time, however, a rating system will usually have undergone some revisions,
triggered by the experience a bank has collected by using the rating system. That is why
multi-period tests may draw on outdated information and, in the extreme, make a
statement on a rating system which has ceased to exist.

Our conclusion is that backtesting techniques as described here have to be carefully 
embedded into a comprehensive validation approach of rating systems. Validation of 
PDs should be the first element in a top down validation approach, since a successful — 
keeping in mind its limits — backtesting is just a necessary prerequisite for a well 
functioning rating system. Backtesting may reveal deficiencies in a rating system, but 
the final conclusion as to whether the rating system works as designed or not can be 
drawn only if the entire rating system is looked at. 


Chapter 15
PD-Validation: Experience from Banking Practice


Robert Rauhmeier 


15.1 Introduction 


This chapter deals with statistical hypothesis tests for the quality of estimates of 
probabilities of defaults (PDs). The focus is on the practical application of these 
tests in order to meet two main targets. Firstly, bank internal requirements have to 
be met, assuming that PDs from bank internal rating systems are an essential 
element of modern credit risk management. Secondly, under the future regime of 
the Basel II framework, regular recurrent validations of bank internal rating systems 
have to be conducted in order to get (and retain!) the approval of banking super- 
visors for the purpose of calculating the regulatory capital charge. 

The theoretical findings are illustrated by an empirical validation study with real 
world rating data from bank internal models. We want to illustrate how validation — 
or more accurately, statistical backtesting — could be conducted with real world 
rating data in banking practice. 

We organised this article as follows. In the second section we describe briefly 
how rating systems are commonly used in the banking industry. Some basic 
notation is introduced in Sect. 15.3. In the fourth section, common statistical tests 
like the exact and the approximated binomial test, the Hosmer-Lemeshow test and 
the Spiegelhalter test, are discussed. These tests are suitable for testing the absolute 
quality of a rating system presuming that the final outcome of the analyzed rating 
system is a forecast of default probabilities. For comparing two rating systems, a
further central issue in rating practice, additional tests are required. In validation
practice, these tests can be used to analyze whether using expert human opinion, 
which is usually applied subsequent to the pure machine rating, significantly 
improves the quality of the rating. The application of the tests discussed in this 
article is limited by assumptions, e.g., independence of the default events or high 
numbers of obligors in order to fulfil the central limit theorem. Section 15.5
presents some practical guidance to tackle these challenges by simulation techni- 
ques. Additional research on the issue, including which of the suggested tests 
performs best under certain portfolio compositions is presented. Furthermore, 
results on the analysis regarding the test power (β-error) under practical, near to
reality conditions are shown. In Sect. 15.6, we introduce the concept of creating 
backtesting samples from databases found in banking practice. Section 15.7 illus- 
trates the theoretical considerations developed in previous sections by real world 
rating data and Sect. 15.8 concludes. 


15.2 Rating Systems in Banking Practice 


15.2.1 Definition of Rating Systems 


Firstly, we define the outcome of a rating system. In this article, a rating system 
forecasts a 1-year default probability of a (potential) borrower. It is not just a rank 
order of creditworthiness, nor an estimate of overall (expected) losses, nor the 
prediction of specific default events.¹ The latter means that we suppose that defaults
are the realisation of random variables and a rating system consequently can at best
forecast accurate probabilities for an event but not the event itself.² Secondly, it
needs to be specified what is meant by a default. In this article and especially in the
empirical example we refer to the Basel II default definition.³


15.2.2 Modular Design of Rating Systems 


Often, bank internal rating systems are designed in a modular way, which is 
sketched in Fig. 15.1. The first module is often called ‘machine rating’, because a 
mechanical algorithm generates a first proposal for the borrower’s PD. Typically, 
this algorithm is based on statistical models as described in the initial chapters of 
this book. Usually this module is composed of a quantitative element, which 
consists of hard risk drivers (e.g., balance sheet ratios, legal form, gender, profes- 
sion, age) and a qualitative element consisting of soft risk drivers, which have to be 
assessed by the loan manager or rating analyst (e.g., management quality, competi- 
tiveness of the borrower). 


¹We use the term forecast instead of estimation in order to emphasise that at the time the rating for
a certain borrower is done, the event in question, namely the default, is in the future.

²We will come to this later in Sect. 15.3.

³See BCBS (2005a), § 452 seqq.


[Figure 15.1: flow chart from “Begin of rating procedure” through Module 1 (Machine
Rating), Module 2 (Expert-guided Adjustments), Module 3 (Supporter Logic) and Module 4
(Manual Override) to “End of rating procedure. Result: Forecast of the Default
Probability for Borrower i expressed in rating grade”]

Fig. 15.1 Modular Design of Rating Systems


The second module, “expert-guided adjustments”, allows for the adjustments of 
the rating by the analyst subject to obligor specific details not or not sufficiently 
reflected in the “machine rating”. Usually this is done in a standardised form, for 
example, possibly by selecting predefined menu items and evaluating their severity. 
This is in contrast to the qualitative part of module 1, where the weights of the 
respective risk drivers are fixed by the algorithms and only the value has to be 
assessed (for example “good”, “average” or “bad”). In module 2, even the weight of 
the risk driver can be determined by upgrading or downgrading in full rating 
grades (Sect. 15.2.4). As an interim result, we obtain the stand-alone-rating of the 
borrower. 

Module 3 “supporter logic” captures effects arising from a potential backing of a 
borrower close to default. This module is especially important for rating systems 
designed for corporates and banks.⁴ Here, often expert guided weightings of
borrower ratings and potential supporter ratings are used, flanked with some 
reasonable guidelines. Like the first two modules, module 3 is also tightly standar- 
dised. These three modules have to be subsequently passed through and will result 
in a rule-based proposal for the PD. Since it is impossible to foresee every 
eventuality affecting the creditworthiness of a borrower in the model building 
process and the ultimate goal of the bank is to forecast the PD as accurately as 
possible for each individual borrower, the rating system might allow an override of 
the rule based rating. In our modular rating approach this refers to module 4 
“manual override”. Overrides should be of exceptional character and must be 
well documented, justified and approved by a senior management board. Additionally,
Basel II requires separate monitoring of overrides.⁵ Therefore, we suggest
incorporating monitoring of overrides into the annual validation process. Frequent
reasons for overrides could lead to a refinement of the rule-based modules of the
rating system.


⁴Contrary to widespread opinion, the term ‘supporter’ must not be taken literally because the
supporter could even have a negative influence on the PD of the borrower. Furthermore, all parties with
strong direct influence on the PD of the borrower should be considered here. A common case is the
influence of the corporate group in which the borrower is embedded, but other essential
(one-sided) dependencies could also be taken into account. For example, an automobile
manufacturer might support its most important supplier in case of an imminent default in order to
protect its own medium-term interests.

It has to be stressed that the detailed design of the sketched modular set-up of a 
rating system may strongly vary in practice and even one or more modules will be 
omitted if they are irrelevant, impractical or even too cost-intensive in relation to 
the expected benefits. A good example here is retail business with credit cards, 
where often the machine module is used exclusively. 


15.2.3 Scope of Rating Systems 


A rating model is a model of the real world process that generates default events. 
This process is called “default generating process” (DGP) and can be thought of as a 
function of various risk drivers. The rating model takes into account only a limited
number of selected key risk drivers of the DGP. Since borrowers of different
portfolio segments follow different DGPs it is a consequence that there have to
be as many different rating systems as portfolio segments to cover the whole
portfolio.⁶ But all rating systems have the same intrinsic aim, namely to forecast
the 1-year-PD of a borrower as well as possible. With this in mind, the introduced
backtesting methods are applicable in general for all rating systems as long as they
are forecasting 1-year-PDs and realisations of defaults or non-defaults can be
observed. Certainly, there are constraints regarding the number of borrowers (and
the number of associated defaults).⁷ These constraints affect the significance of the
results of the statistical backtesting, but not the methodology itself. 


15.2.4 Rating Scales and Master Scales 


It is common banking practice to use rating scales. This means that there is only a 
limited number of possible PD forecasts (associated with the corresponding rating 
grades) rather than a continuum of PD forecasts. Usually, there is a bank wide rating 
scale called a “master scale” which all rating systems are mapped into. An example 
of a master scale is illustrated in Table 15.1. 

The table is to be interpreted as follows. If the machine rating module, assuming 
a logistic regression model is used, produces a forecast PD of 0.95%, then it fits into 


⁵See BCBS (2005a), § 428.

⁶Strictly speaking every borrower follows its own specific DGP but in practice borrowers
following similar DGPs can be pooled into portfolio segments.

⁷See Chap. 5 where low default portfolios are treated.


Table 15.1 Illustration of a master scale

Rating grade    PD range       PD of grade
1               …              …
…               …              …
4               …              0.11%
…               …              …
8               0.80-1.40%     1.05%
…               …              …


Fig. 15.2 Typical Master Scale: exponential run of the curve


the PD range of rating grade 8 and for the sake of keeping things simple, we round 
this forecast to 1.05% as it is the (geometrical) mean of the boundaries. 

We could interpret this kind of loss of measurement accuracy simply as round- 
ing-off difference. Using a master scale has certain advantages. For example, it is 
easier to generate reports and figures and for bank internal communication in 
general. Moreover, for some people it is easier to think in a few discrete values
instead of a continuous measurement scale. This is especially relevant when
adjustments of ratings coming from the pure machine rating within module 2 are
completed by upgrading or downgrading rating grades. But there are obvious pitfalls
accompanying the use of a master scale, which arise from solely thinking in rating
grades and neglecting the fact that these grades are just proxies or aliases of forecast
PDs. For instance, downgrading a borrower from grade 4 to grade 8 does not mean a 
doubling of the PD. Because of the exponential relationship of grades and 
corresponding PDs, this means nearly a tenfold increase in the forecast PD. 
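
The mapping of a model PD onto a master scale can be sketched as follows. The scale used here is hypothetical: it assumes geometrically spaced grade boundaries with a constant factor of 1.75, a value inferred only from the one fully legible range in Table 15.1 (grade 8: 0.80-1.40%), and it takes the geometric mean of the boundaries as the PD of a grade. It is not the master scale of any particular bank.

```python
import math

LOWER_BOUND_GRADE_8 = 0.0080   # from Table 15.1
FACTOR = 1.75                  # assumed ratio of upper to lower bound


def grade_bounds(grade):
    """Lower and upper PD boundary of a grade on the hypothetical scale."""
    lower = LOWER_BOUND_GRADE_8 * FACTOR ** (grade - 8)
    return lower, lower * FACTOR


def grade_pd(grade):
    """PD of a grade as the geometric mean of its boundaries."""
    lower, upper = grade_bounds(grade)
    return math.sqrt(lower * upper)


def grade_for_pd(pd):
    """Grade whose PD range contains the model forecast pd."""
    return 8 + math.floor(math.log(pd / LOWER_BOUND_GRADE_8) / math.log(FACTOR))


machine_pd = 0.0095                       # forecast of the machine rating
grade = grade_for_pd(machine_pd)          # falls into grade 8
ratio_8_to_4 = grade_pd(8) / grade_pd(4)  # close to a tenfold increase
```

Under these assumptions the scale reproduces the grade PDs of roughly 1.05% and 0.11% quoted for grades 8 and 4, and shows why a downgrade by four grades multiplies the forecast PD by a factor of about nine rather than two.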

As seen in Fig. 15.2, master scales often have the attribute that the PD according
to the rating grades increases roughly exponentially.⁸ Two reasons may explain
this. First, master scales are sometimes inspired by the scale of the rating agencies
and the derived default rates for these grades. Second, banks want (and supervisors
require, see BCBS (2005a), § 403) to have a meaningful distribution of their
borrowers across their rating grades.


⁸Thinking in a logarithmic world, ln(PD) of the master scale grows almost linearly in the grades.

As noted above, a master scale is used group-wide. Rating grades of the master scale mean the same across different portfolios. For example, a rating grade 8 means the same, namely a forecast PD of 1.05%, no matter whether it is assigned to a large corporate or a retail customer. Additionally, we assume that rating grades of the master scale mean the same across time. A rating grade 8 means the same no matter whether it is assigned in 1998 or 2005. This definition is often referred to as the Point-in-Time (PIT) rating approach.


15.2.5 Parties Concerned by the Quality of Rating Systems 


In general we can distinguish three groups of stakeholders of a bank’s internal rating system, as illustrated in Fig. 15.3.

First of all, there is the supervisory authority, whose main objective is to ensure the stability of credit markets and financial markets in general. Therefore, the solvency of the bank itself has to be assured. Transferring this intention to the field of testing the quality of rating systems, supervisors will accept forecast PDs that are too high compared to the true PDs, but they will intervene if the default risk is significantly underestimated. The supervisory authority thus tends to follow a rather conservative approach, which is understandable from its position.

The opposite holds for the (potential) borrower, who is interested in low interest rates and favourable credit conditions. Assuming the price of the credit, or at least the granting of the credit itself, depends on the PD (besides the other risk parameters LGD and EAD), the borrower calls for a low PD assessment. So an underestimation of his PD is all right for the borrower, but an overestimation of his PD is not acceptable from his point of view.


Fig. 15.3 Parties concerned by the quality of rating systems

Borrower: favourable conditions, low interest rates. Liberal approach: “Assess my risk as low as possible”.
Financial institution / bank: optimisation of capital allocation, pricing, maximisation of profits. Accurate risk assessment; with (systematic) under-/overestimation of risk, “bad borrowers” are attracted and “good borrowers” are lost.
Supervisory authority: assurance of financial stability in financial markets. Conservative approach: “In case of doubt a bank has to estimate risk parameters in a conservative manner”.
In the long run, banks with the most accurate rating system will prevail at the credit market!
The third party is the bank itself. Each kind of misjudgement of the creditworthiness harms an optimal capital allocation, a good pricing system, and, in consequence, the maximisation of profits. Therefore, we conclude that neither an underestimation nor an overestimation of risk is satisfactory. In terms of statistical test theory, supervisors and borrowers would perform one-sided statistical hypothesis tests whereas the bank prefers two-sided tests.

We introduce some notation in the next section and describe the theoretical 
framework. 


15.3 Statistical Framework 


We have obligors $i = 1, \ldots, N$, each with a true but unfortunately unknown probability of default $\pi_i \in [0; 1]$. The main intention of the rating system is to forecast each $\pi_i$ as accurately as possible. We denote a forecast by $\hat{\pi}_i$.

We want to start with the description of the theoretical framework of the default 
generating process (DGP). Therefore, we mainly refer to the well known model of 
categorical regression in its variations, logistic regression or probit regression. 
These topics are explained in detail in Chap. 1. 

The standard method used to describe a binary outcome variable $y_i$ depending on one or more variables $x_i$ is the categorical regression model. The model equation is

$$\pi_i(x_i) = P(y_i = 1 \mid x_i) = F(x_i'\beta) \tag{15.1}$$


The outcome variable $y_i$ takes the value $y_i = 1$ if a default is observed and $y_i = 0$ if a non-default is observed. The vector $x_i$ includes all kinds of risk drivers. These may be financial ratios, obligor-specific indicators like age or marital status, macroeconomic risk factors like the GDP growth rate or interest rates, and even variables describing trends in industrial sectors. These variables mainly depend on the specific segment of obligors that is considered and on the data that is in general available for this segment.⁹ Note that in (15.1), the probability of default for obligor $i$, $\pi_i$, is the outcome of the model and not the forecast of the outcome event itself. Therefore, it fits perfectly into our basic understanding of what a rating system should do, as described in Sect. 15.1. The probability that obligor $i$ ends up in the status non-default is simply

$$1 - \pi_i(x_i) = P(y_i = 0 \mid x_i) = 1 - F(x_i'\beta) \tag{15.2}$$


9In more sophisticated models like panel models or hazard rate models (see Chap. 1), the time index $t$ has to be incorporated besides index $i$ in order to account for the time dependency of the risk drivers. In rating practice it is often assumed that the risk drivers in $x$ are time-lagged (e.g. $x_{i,t-1}$) when explaining the default of borrower $i$ in $t$. For the sake of simplicity, we neglect this time-series component in this chapter.
The specification of the cumulative distribution function $F(\cdot)$ determines whether we assume a logistic model

$$\pi_i(x_i) = P(y_i = 1 \mid x_i) = \frac{e^{x_i'\beta}}{1 + e^{x_i'\beta}} \tag{15.3}$$

or a probit model

$$\pi_i(x_i) = P(y_i = 1 \mid x_i) = \Phi(x_i'\beta) \tag{15.4}$$

where $\Phi(\cdot)$ denotes the cumulative standard normal distribution function. Other specifications for $F(\cdot)$ exist.

Often $x_i'\beta$ is called the linear predictor or simply the score. The vector $\beta$ consists of the weights for the risk drivers in $x_i$ used to obtain the score. Because $F(\cdot)$ represents a cumulative distribution function, a monotonic relationship between $x_i'\beta$ and $\pi_i$ is assured.
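As a minimal sketch of (15.3) and (15.4), the two link functions can be written as follows. The score values are hypothetical and chosen only to show the monotonic mapping from score to PD; they are not taken from any estimated model.

```python
import math
from statistics import NormalDist

def pd_logit(score):
    """Logistic model (15.3): exp(score) / (1 + exp(score))."""
    return 1.0 / (1.0 + math.exp(-score))

def pd_probit(score):
    """Probit model (15.4): Phi(score)."""
    return NormalDist().cdf(score)

# Hypothetical scores x'beta, for illustration only.
for score in (-4.0, -2.0, 0.0):
    print(score, round(pd_logit(score), 4), round(pd_probit(score), 4))
```

Both functions map the same score to different PDs, so a score is only meaningful together with the link function that was assumed when the weights were estimated.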

Some conceptual background should explain (15.3) and (15.4), the models of categorical regression: Suppose that behind the observable dichotomy of the dependent variable $y_i$ there is a non-observable, i.e. latent, continuous variable $\tilde{y}_i$. The value of $\tilde{y}_i$ depends on the value of the risk drivers $x_i$. If the latent variable $\tilde{y}_i$ falls below the also latent threshold $\theta_i$, the status $y_i = 1$ is observed, otherwise the status $y_i = 0$ is realised:

$$\begin{aligned} y_i = 1 &\iff \tilde{y}_i = x_i'\beta + \tilde{\varepsilon}_i \le \theta_i \\ y_i = 0 &\iff \tilde{y}_i = x_i'\beta + \tilde{\varepsilon}_i > \theta_i \end{aligned} \tag{15.5}$$

The error term $\tilde{\varepsilon}_i$ allows for randomness and is needed to account for idiosyncratic risk factors not covered in $x_i$. The random error term $\tilde{\varepsilon}_i$ follows a cumulative distribution function $F(\cdot)$, and we find

$$\begin{aligned} \pi_i(x_i) &= P(y_i = 1 \mid x_i) = P(\tilde{y}_i \le \theta_i) \\ &= P(\tilde{\varepsilon}_i \le \theta_i - x_i'\beta) = F(\theta_i - x_i'\beta) = F(\tilde{\theta}_i) \end{aligned} \tag{15.6}$$

The latent threshold $\theta_i$ can be combined with the constant $\beta_0$ in $\beta$ and we obtain our starting point equation (15.1). Depending on the cumulative distribution function that is assumed for $\tilde{\varepsilon}_i$, a logit (15.3) or probit (15.4) model is obtained.

Further on, we will restrict ourselves to the standard normal distribution function. For example, for a borrower $i$ with a rating grade $k = 8$, accompanied by a probability of default $\hat{\pi}_i = \hat{\pi}_{k=8} = 0.0105$, we obtain

$$\tilde{\theta}_{i|k=8} = \Phi^{-1}(0.0105) = -2.3080.$$

So $\tilde{\theta}_i$ is determined by the PDs of the master scale grades.
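The threshold can be reproduced with the inverse standard normal distribution function from the Python standard library. This is only a numerical check of the value quoted above; the function name is a hypothetical helper.

```python
from statistics import NormalDist

def threshold_from_pd(pd_forecast):
    """Default threshold Phi^{-1}(PD) of the latent variable."""
    return NormalDist().inv_cdf(pd_forecast)

theta_grade_8 = threshold_from_pd(0.0105)
print(round(theta_grade_8, 4))
```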
As a next step, we want to extend the model in order to integrate the possibility of modelling dependencies in the DGP. A widely used approach is the one-factor model,¹⁰ which is also the basis of the Basel II formula for the risk weighted assets.

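We split up the error term $\tilde{\varepsilon}_i$ in equation (15.5) into the components $\varepsilon_i$ and $f$ and get

$$\begin{aligned} y_i = 1 &\iff \tilde{y}_i = x_i'\beta + \sqrt{\rho} \cdot f + \sqrt{1-\rho} \cdot \varepsilon_i \le \theta_i \\ y_i = 0 &\iff \tilde{y}_i = x_i'\beta + \sqrt{\rho} \cdot f + \sqrt{1-\rho} \cdot \varepsilon_i > \theta_i \end{aligned} \tag{15.7}$$

where $f \sim N(0,1)$ and $\varepsilon_i \sim N(0,1)$ are normally distributed random variables with mean zero and standard deviation one. The random variable $\varepsilon_i$ represents the idiosyncratic risk and $f$ represents the so-called systematic risk. It is assumed that idiosyncratic risk and systematic risk are independent and that idiosyncratic risk is independent for two different borrowers. Therefore, the integration of the systematic factor $f$ models dependencies in the DGP of two borrowers, and $\rho$ is called the asset correlation:¹¹

$$\begin{aligned} \sigma^2 &= Var(\tilde{y}_i) = (\sqrt{\rho})^2 + (\sqrt{1-\rho})^2 = 1 \\ \sigma_{ij} &= Cov(\tilde{y}_i, \tilde{y}_j) = (\sqrt{\rho})^2 = \rho \\ \rho_{ij} &= Corr(\tilde{y}_i, \tilde{y}_j) = \frac{Cov(\tilde{y}_i, \tilde{y}_j)}{\sqrt{Var(\tilde{y}_i) \cdot Var(\tilde{y}_j)}} = \rho \end{aligned} \tag{15.8}$$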


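Conditional on the realisation $f$ of the common random factor, the (conditional) default probability becomes

$$\begin{aligned} \pi_i(x_i, f) &= P(y_i = 1 \mid x_i, f) = P(\tilde{y}_i \le \theta_i) \\ &= P\left(x_i'\beta + \sqrt{\rho} \cdot f + \sqrt{1-\rho} \cdot \varepsilon_i \le \theta_i\right) \\ &= P\left(\varepsilon_i \le \frac{\theta_i - x_i'\beta - \sqrt{\rho} \cdot f}{\sqrt{1-\rho}}\right) \\ &= \Phi\left(\frac{\tilde{\theta}_i - \sqrt{\rho} \cdot f}{\sqrt{1-\rho}}\right) \end{aligned} \tag{15.9}$$

Up to now, this detailed annotation may seem to be purely academic, but we will see its practical benefits in Sect. 15.5, where we extend the (standard) statistical hypothesis tests introduced in the following section by using this simple but very useful model variant in order to account for dependent default events.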
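A minimal sketch of (15.9), assuming the threshold is derived from an unconditional PD as in Sect. 15.3. The function name and the chosen values of $\rho$ and $f$ are illustrative. Note the sign convention: a default occurs when the latent variable falls below the threshold, so a negative realisation of $f$ raises the conditional PD.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def conditional_pd(pd_unconditional, rho, f):
    """Conditional PD given the systematic factor f, following (15.9)."""
    theta = _STD_NORMAL.inv_cdf(pd_unconditional)   # threshold of the grade
    return _STD_NORMAL.cdf((theta - sqrt(rho) * f) / sqrt(1.0 - rho))

# Grade 8 (PD = 1.05%) with an assumed asset correlation of 0.10.
for f in (-2.0, 0.0, 2.0):
    print(f, round(conditional_pd(0.0105, 0.10, f), 5))
```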



10See for example Finger (2001).

11The asset correlation can be transformed into default correlations as shown in several papers, see e.g. BCBS (2005b, Chap. III).
15.4 Central Statistical Hypothesis Tests Regarding Calibration


As should have become apparent, the realisation $y_i = 1$ or $y_i = 0$, respectively, is the result of a random process (the DGP), which is expressed by including the random variable $\varepsilon_i$ in our approach. This means that even if the parameters $\beta$ of the model are specified perfectly correctly, some unpredictable randomness still remains. Hence it is clear that a certain realisation of the default event cannot be forecast, because this would imply that the rating system could exactly predict the realisation of the random variable $\varepsilon_i$. This situation can easily be compared to the well-known random experiment of throwing a die. Even if you know that a six-sided die is fair, you cannot predict the result. The best you can specify is the probability of throwing a certain number, in this example 1/6. By analogy, the best a rating system can do is to forecast the probability of default as exactly as possible for each obligor $i$.

In the following, the term “calibration” is sometimes used. In our context, calibration means a property of a rating system and not an action. The latter interpretation as an action, “to calibrate a model”, means to estimate the parameters of the (statistical) model, e.g. to estimate the coefficients in the equation of the logistic regression by means of OLS or a maximum likelihood estimator. In this article, however, “calibration” is meant in the sense of “to be calibrated”. The phrase refers to the outcomes of the rating system and is a property of the rating system. This means that each forecast probability of default is right: $\hat{\pi}_i = \pi_i \;\; \forall i$. Therefore, we next introduce several approaches to testing calibration.


15.4.1 Binomial Test 


15.4.1.1 Exact Binomial Test 


Someone whose task is to validate the hypothesis whether the PDs predicted by a 
rating system are consistent with observed default events, will most likely perform 
the well known binomial test, as presented in standard statistical textbooks, as a first 
step. 

Suppose we have $N_g$ obligors in rating grade $g$, and all of them have the same (true but unknown) probability of default $\pi_g$. If we assume that the realisations are independent of each other (we will drop this constraint at a later stage), then the number of defaults in grade $g$, $N_{g,y=1}$, follows a binomial distribution with

$$P(N_{g,y=1} \mid N_g, \pi_g) = \binom{N_g}{N_{g,y=1}} \pi_g^{N_{g,y=1}} (1 - \pi_g)^{N_g - N_{g,y=1}} \tag{15.10}$$

Based on this, we could perform a statistical hypothesis test with the null hypothesis

$$H_0: \pi_g = \hat{\pi}_g \tag{15.11}$$

and the alternative

$$H_1: \pi_g \ne \hat{\pi}_g \tag{15.12}$$


where $\hat{\pi}_g$ denotes the forecast derived from the rating system. The test statistic is the observed number of defaults $N_{g,y=1}$, and we reject the null hypothesis if observing $N_{g,y=1}$ under $H_0$ is too unlikely. What is meant by “too unlikely” is defined by the significance level $\alpha$. Knowing the distribution of $N_{g,y=1}$ under $H_0$, we can calculate the critical region as

$$N_{g,y=1} \le b(\alpha/2) \quad \text{or} \quad N_{g,y=1} \ge b(1 - \alpha/2) \tag{15.13}$$

where $b(\cdot)$¹² is the quantile of the cumulative distribution function of the binomial distribution $B(N_g, \hat{\pi}_g)$.

Figure 15.4 illustrates an example with $N_g = 350$ obligors in rating grade 8. Observing at least 9 defaults or no default at all is too unlikely under the null hypothesis. In this case we would reject the null hypothesis, knowing that we make a wrong decision with a probability of at most $\alpha = 0.05$.
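The rejection areas of this example can be reproduced with a small sketch of the exact binomial test. The function names are hypothetical helpers; the convention follows (15.13), i.e. $H_0$ is rejected if the number of defaults is at most the lower or at least the upper critical value.

```python
def binomial_pmf_table(n, p):
    """P(X = k) for k = 0..n, built recursively to avoid huge binomial coefficients."""
    pmf = [(1.0 - p) ** n]
    for k in range(1, n + 1):
        pmf.append(pmf[-1] * (n - k + 1) / k * p / (1.0 - p))
    return pmf

def critical_values(n, p, alpha=0.05):
    """Return (lower, upper): reject H0 if defaults <= lower or defaults >= upper.

    An entry is None if no outcome on that side is unlikely enough.
    """
    pmf = binomial_pmf_table(n, p)
    lower, cumulative = None, 0.0
    for k in range(n + 1):          # largest k with P(X <= k) <= alpha / 2
        cumulative += pmf[k]
        if cumulative > alpha / 2:
            break
        lower = k
    upper, tail = None, 0.0
    for k in range(n, -1, -1):      # smallest k with P(X >= k) <= alpha / 2
        tail += pmf[k]
        if tail > alpha / 2:
            break
        upper = k
    return lower, upper

lower, upper = critical_values(350, 0.0105)
print(lower, upper)
```

For a much smaller grade the lower rejection area can vanish entirely, in which case the sketch returns `None` for the lower critical value.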


Fig. 15.4 Illustrative binomial test with marked rejection areas (bar chart of the binomial probabilities over the number of defaults; $\alpha = 0.05$, $H_0: \pi_{g=8} = 0.0105$, $H_1: \pi_{g=8} \ne 0.0105$)

12It has to hold for $b(\alpha/2)$: $B(b(\alpha/2) \mid N_g; \hat{\pi}_g) \le \alpha/2 < B(b(\alpha/2) + 1 \mid N_g; \hat{\pi}_g)$, and for $b(1 - \alpha/2)$: $1 - B(b(1 - \alpha/2) - 1 \mid N_g; \hat{\pi}_g) \le \alpha/2 < 1 - B(b(1 - \alpha/2) - 2 \mid N_g; \hat{\pi}_g)$.
15.4.1.2 Normal Approximation of the Binomial Test

The normal approximation of the exact binomial test is often applied in practice, using the fact that the exact discrete binomial distribution converges to the normal distribution for increasing sample sizes. As a rule of thumb, this approximation may be sound if $N_g \cdot \pi_g > 10$ and at the same time $N_g \cdot \pi_g \cdot (1 - \pi_g) > 10$ holds.¹³ The number of defaults is then approximately normally distributed, $N_{g,y=1} \sim N(N_g \cdot \pi_g;\; N_g \cdot \pi_g \cdot (1 - \pi_g))$, and the test statistic is constructed as

$$Z_{bin} = \frac{N_g \cdot \bar{y}_g - N_g \cdot \hat{\pi}_g}{\sqrt{N_g \cdot \hat{\pi}_g \cdot (1 - \hat{\pi}_g)}} \sim N(0,1) \tag{15.14}$$

It follows a standard normal distribution, where $\bar{y}_g = N_{g,y=1}/N_g$ denotes the observed default rate in rating grade $g$. Performing the two-sided hypothesis test, the critical values can easily be derived as the $\alpha/2$- and $(1 - \alpha/2)$-quantiles of the standard normal distribution.
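A minimal sketch of (15.14) follows. The grade with 2,000 obligors and 30 observed defaults is a hypothetical example, chosen so that the rule of thumb above is satisfied for a forecast PD of 1.05%; the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def binomial_z(n_obligors, n_defaults, pd_forecast):
    """Test statistic Z_bin of the normal approximation (15.14)."""
    expected = n_obligors * pd_forecast
    std = sqrt(n_obligors * pd_forecast * (1.0 - pd_forecast))
    return (n_defaults - expected) / std

def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical grade: 2,000 obligors, 30 defaults, forecast PD 1.05%.
z_bin = binomial_z(2000, 30, 0.0105)
p_value = two_sided_p_value(z_bin)
print(round(z_bin, 3), round(p_value, 3))
```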


15.4.2 Spiegelhalter Test (SPGH) 


Up to now, we have presented very standard approaches. But these approaches have a shortcoming: they are primarily suited for testing a single rating grade, but not several or all rating grades simultaneously.

Spiegelhalter (1986) introduced a generalisation which we call the Spiegelhalter test (SPGH). Originally it was used in the context of clinical statistics and the validation of weather forecasts.

The starting point is the Mean Square Error (MSE), also known as the Brier score,¹⁴

$$MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{\pi}_i)^2 \tag{15.15}$$

representing the squared difference between the default ($y_i = 1$) and non-default ($y_i = 0$) indicators, respectively, and the corresponding default probability forecast $\hat{\pi}_i$,¹⁵ averaged across all obligors.

Obviously the MSE gets small, if the forecast PD assigned to defaults is high and 
the forecast PD assigned to non-defaults is low. Generally speaking, a small value 
of MSE indicates a good rating system. The higher the MSE the worse is the 
performance of the rating system (keeping other things equal). 


13This rule of thumb may vary depending on what statistical text book is consulted. 


14See Brier (1950). 


15 $\hat{\pi}_i = \hat{\pi}_g$, if obligor $i$ is rated in rating grade $g$.
The MSE can be interpreted as a weighted average of independent Bernoulli distributed random variables. Spiegelhalter derived an approach which allows us to test whether an observed MSE is significantly different from its expected value or not. Again the hypotheses are

$$H_0: \pi_i = \hat{\pi}_i \;\; \forall i \quad \text{and} \quad H_1: \text{not } H_0 \tag{15.16}$$

Then under $H_0$ the MSE has an expected value of

$$E(MSE_{\pi_i = \hat{\pi}_i}) = \frac{1}{N} \sum_{i=1}^{N} \pi_i (1 - \pi_i) \tag{15.17}$$

and variance

$$Var(MSE_{\pi_i = \hat{\pi}_i}) = \frac{1}{N^2} \sum_{i=1}^{N} (1 - 2\pi_i)^2 \cdot \pi_i (1 - \pi_i) \tag{15.18}$$

It is obvious from (15.17) that the expected value of the MSE under the null hypothesis is greater than zero¹⁶ and a function of the true (but unknown) probabilities of default. Therefore the absolute value of the MSE is not a meaningful performance index of the rating system, because its value is constrained by the quality of the rating system and the portfolio structure, i.e. the true but unknown default probabilities.

Using the central limit theorem, it can be shown that under the null hypothesis the test statistic

$$Z_S = \frac{MSE - E(MSE_{\pi_i = \hat{\pi}_i})}{Var(MSE_{\pi_i = \hat{\pi}_i})^{0.5}} \sim N(0,1) \tag{15.19}$$

follows a standard normal distribution, and the familiar steps for coming to a test decision have to be conducted.
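The following sketch computes the statistic in (15.19) from a list of default indicators and a list of forecast PDs, evaluating (15.17) and (15.18) at the forecasts as prescribed by $H_0$. The function name and the example portfolio (2,000 obligors in one grade, 30 defaults) are hypothetical.

```python
from math import sqrt

def spiegelhalter_z(outcomes, pd_forecasts):
    """SPGH statistic Z_S: standardised deviation of the MSE from its H0 expectation."""
    n = len(outcomes)
    mse = sum((y - p) ** 2 for y, p in zip(outcomes, pd_forecasts)) / n
    expected = sum(p * (1 - p) for p in pd_forecasts) / n                          # (15.17)
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in pd_forecasts) / n ** 2  # (15.18)
    return (mse - expected) / sqrt(variance)

# Hypothetical single-grade portfolio: 30 defaults among 2,000 obligors, PD 1.05%.
outcomes = [1] * 30 + [0] * 1970
forecasts = [0.0105] * 2000
z_s = spiegelhalter_z(outcomes, forecasts)
print(round(z_s, 3))
```

With a single forecast PD for all obligors, the value coincides with $Z_{bin}$ of the approximated binomial test for the same data.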

It can be shown that a forecaster (in our case the rating system) minimizes its expected MSE when he or she forecasts the probability of default for each obligor equal to its true default probability.¹⁷ There is no way of improving the MSE by modifying the forecast probabilities away from the true probabilities. Thus it can be

16As long as we do not consider the special case of a deterministic DGP, where all true PDs are zero or one.

17See De Groot and Fienberg (1983).
stated that the MSE rewards honest forecasting. This is known as a proper scoring rule.¹⁸
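The property can be illustrated numerically for a single obligor. The sketch below evaluates the expected squared error for a grid of candidate forecasts and checks which one is smallest; the true PD of 1.05% and the grid are illustrative assumptions.

```python
def expected_squared_error(forecast, true_pd):
    """E[(y - forecast)^2] for a default indicator y ~ Bernoulli(true_pd)."""
    return true_pd * (1.0 - forecast) ** 2 + (1.0 - true_pd) * forecast ** 2

true_pd = 0.0105
candidates = [i / 10000 for i in range(0, 501)]   # forecasts from 0% to 5% in steps of 0.01%
best_forecast = min(candidates, key=lambda f: expected_squared_error(f, true_pd))
print(best_forecast)
```

The minimum is attained at the true PD, where the expected squared error equals $\pi(1 - \pi)$, in line with (15.17).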

As a special case of the SPGH statistic, namely if there is just one single probability of default in the entire portfolio, the SPGH statistic $Z_S$ is exactly equal to the $Z_{bin}$ of the approximated binomial test.¹⁹

The major advantage of the SPGH test over the binomial test is that with the former all rating grades can be tested simultaneously on the property of calibration within one step.²⁰


15.4.3 Hosmer-Lemeshow-$\chi^2$ Test (HSLS)


The same can be done with an approach introduced by Hosmer and Lemeshow (1980, 2000). Their test statistic has its origin in the field of categorical regression and is often used in the process of model finding as a performance measure for “goodness-of-fit”.

The SPGH test penalizes squared differences between realised event indicators (default or non-default) and PD forecasts on an individual level.²¹ In contrast, the basic idea of the Hosmer-Lemeshow test (HSLS) is to penalize squared differences of forecast default rates from realised default rates on a group level, as can be seen from the numerator terms in (15.20).

$$\chi^2_{HSLS} = \sum_{g=1}^{G} \frac{(N_g \cdot \bar{y}_g - N_g \cdot \hat{\pi}_g)^2}{N_g \cdot \hat{\pi}_g \cdot (1 - \hat{\pi}_g)} \tag{15.20}$$

Originally the groups come from arranging individual forecasts into e.g. ten centiles or by using the number of covariate patterns in the logistic regression model. In this context, the groups are defined by the rating grades.²² When using the HSLS test statistic as a means of backtesting, $\chi^2_{HSLS}$ is approximately $\chi^2$-distributed with $G$ degrees of freedom.²³ ²⁴ This can easily be seen because $\chi^2_{HSLS}$ consists in fact of $G$ independent squared standard normal distributed random variables if


18See e.g. Murphy and Daan (1985).

19See Appendix A.

20Rauhmeier and Scheule (2005) show that by factorising the MSE more rating properties can be derived, and how they influence Basel II capital.

21See (19).

22Hosmer et al. (1988) allude to some approximation conditions, e.g. that in about 4/5 of all groups the expected number of defaults should exceed five and in no group should the number of defaults be smaller than one.

23$G$ denotes the number of rating grades with $N_g \ge 1$, i.e. with at least one obligor being rated in this class.

24When using the HSLS statistic as a measure of fit in the process of model finding, then we say “in-sample”, because the model estimation sample and the sample on which the measure of fit is
$$H_0: \pi_g = \hat{\pi}_g \;\; \forall g \tag{15.21}$$

holds. It can be shown that in an extreme case, when there is just one rating grade at all, the HSLS test statistic and the (squared) SPGH test statistic and the (squared) approximated binomial test statistic are identical.
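A minimal sketch of the HSLS statistic in the grade-wise form of (15.20) is given below. The function name and the example counts are hypothetical. The statistic still has to be compared with a quantile of the $\chi^2$ distribution with $G$ degrees of freedom, which is not computed here.

```python
def hosmer_lemeshow_statistic(grades):
    """HSLS statistic; grades is a list of (n_obligors, n_defaults, pd_forecast)."""
    total = 0.0
    for n_obligors, n_defaults, pd_forecast in grades:
        expected = n_obligors * pd_forecast
        total += (n_defaults - expected) ** 2 / (expected * (1.0 - pd_forecast))
    return total

# Hypothetical single grade: 2,000 obligors, 30 defaults, forecast PD 1.05%.
chi2_single = hosmer_lemeshow_statistic([(2000, 30, 0.0105)])

# Hypothetical two-grade portfolio; the statistic is the sum of the grade-wise terms.
chi2_two = hosmer_lemeshow_statistic([(2000, 30, 0.0105), (5000, 4, 0.0011)])
print(round(chi2_single, 3), round(chi2_two, 3))
```

In the single-grade case the value equals the squared statistic of the approximated binomial test for the same data, as stated above.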


15.4.4 A Test for Comparing Two Rating Systems: 
The Redelmeier Test 


Up to now we have introduced approaches adequate for testing whether the final 
outcomes of the rating system — forecasts of PDs for each obligor — are statistically 
in line with their realisations. This is unquestionably the main objective of statisti- 
cal backtesting. But, more questions arise when dealing with rating systems in 
practice. One might be interested in knowing whether the quality of the rating 
system is significantly enhanced when e.g., using so called human expertise in a 
module subsequent to the machine rating module. This interest might arise from a 
purely statistical perspective, but in banking practice, the rating systems which are 
to be implemented and maintained, are cost intensive. These costs may include 
salaries for the rating analysts as well as IT-related costs for operating systems and 
data storage. 

First of all we want to stress that only a comparison of two or more rating systems by means of the same rating data is meaningful, as mentioned in Sect. 15.4.2 and in Chap. 13. This means the same obligors (by name) in the same time period and with the same default indicator definition have to be used.²⁵ Therefore, while it is in general not feasible to compare ratings across banks (one should think of business confidentiality and protection of data privacy), this may be done in the context of pooling,²⁶ or especially when comparing two rating modules of the same rating system of a bank. We primarily attend to the latter.

The basic idea of the approach introduced by Redelmeier et al. (1991) is to compare two MSEs calculated on the same data basis. A test statistic is derived which allows us to test whether the deviation of a realised MSE from its expected value is significantly different from the deviation of another realised MSE from its expected value, derived by another module on the same data basis. As described in Sect. 15.4.2, the module with the lower MSE is the better one.


computed are identical. In this case the distribution is $\chi^2$ with $G - 2$ degrees of freedom. When using the HSLS statistic for backtesting, we say “out-of-sample”, because no observation appears in both the estimation sample and the validation sample.

25See Chap. 13 and Hamerle et al. (2005).

26Here we mean cooperation of autonomous banks organized in a project structure with the object of gathering data in order to enlarge the common data basis by merging the banks’ individual data bases.
The test statistic²⁷ is


Zr = = (15.22) 


and follows a standard normal distribution under the hypotheses:

$$\begin{aligned} H_0 &: E[(E(MSE_{m1}) - MSE_{m1}) - (E(MSE_{m2}) - MSE_{m2})] = 0 \quad \text{and} \\ H_1 &: E[(E(MSE_{m1}) - MSE_{m1}) - (E(MSE_{m2}) - MSE_{m2})] \ne 0 \end{aligned} \tag{15.23}$$


Note that it only makes sense to compare two MSEs derived from two modules when each module passes a test of calibration, like the SPGH test for example. Otherwise, comparing two MSEs with respect to the property of calibration is useless, knowing that at least one of the two modules does not fulfil the premise of being in line with the forecasts.

We stress that we do not pay attention to theoretical considerations on statistical 
tests regarding discriminatory power as presented in Chap. 13, but we use them in 
our empirical analysis in Sect. 15.7. 


15.5 The Use of Monte-Carlo Simulation Technique 


As mentioned previously, the statistical tests introduced so far are based on crucial assumptions like independent realisations of defaults and/or a large number of observations in order to ensure that the central limit theorem holds. Using a simulation technique, which is sometimes referred to as Monte-Carlo simulation, allows us to drop these limiting assumptions. Fortunately, the basic ideas of the approaches discussed in Sect. 15.4 can be taken up and combined with the default generating process of Sect. 15.3.

Furthermore, these techniques can be used to derive some results on the test power ($\beta$-error) under practical, near-to-reality conditions. This is a fundamental concept in order to highlight the chance of not detecting a low quality rating system.


27See Appendix B for details.
15.5.1 Monte-Carlo-Simulation and Test Statistic: Correction of Finite Sample Size and Integration of Asset Correlation


The fundamental idea is to derive the distribution of the test statistic (e.g., SPGH $Z_S$, HSLS $\chi^2_{HSLS}$) under the null hypothesis by simulation, that is, by replicating a random experiment several times. If basic assumptions like independent default events or infinite sample size are not fulfilled, we can implement those circumstances in our simulation process and substitute the theoretical distribution of the test statistic (e.g., the normal distribution in the case of the SPGH test) by the one obtained by the simulation. All test decisions are then based on the “new” simulated distribution of the test statistic. The more simulation runs are used, the more accurately the new simulated distribution can be determined.

Our approach is very similar to the one in Balthazar (2004) and could be 
interpreted as an extension, as his focus was on tests for a single rating grade 
whereas we want to use tests for all grades simultaneously. 

Firstly, we consider the simulation under $H_0: \hat{\pi}_g = \pi_g$. The simulation approach can best be illustrated in eight steps starting with (15.6):


1. Calculate the threshold $\tilde{\theta}_i = \Phi^{-1}(\hat{\pi}_g)$ depending on which rating grade obligor $i$ is rated into ($\tilde{\theta}_i = \theta_i - x_i'\beta$). Fix the asset correlation $\rho$ before the start of the simulation.²⁸

2. Generate a realisation of the random variable $f \sim N(0,1)$. This represents the common factor of the DGP, the systematic risk.

3. For each obligor $i = 1, \ldots, N$ in the examined portfolio: generate a realisation of the random variable $\varepsilon_i \sim N(0,1)$. This represents the idiosyncratic, unsystematic risk.

4. Calculate the value of $\tilde{y}_i$ under consideration of $\rho$.

5. Calculate whether obligor $i$ defaults in this simulation run according to (15.7).

6. Calculate all the test statistics of interest.

7. Repeat steps two to six, say about 1 million times (i.e., 1 million simulation runs) and generate a simulated distribution of the test statistic (based on the simulated defaults).

8. Having a simulated distribution of the test statistic, the rejection areas of $H_0$ can be calculated and, by comparison with the observed test statistic value, a test decision can be derived for each test considered.
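The eight steps can be sketched for the SPGH statistic as follows. This is a simplified illustration, not the implementation used for the simulation study: the function names, the two-grade portfolio and the small number of runs (400 instead of about 1 million) are assumptions chosen so that the example finishes quickly in pure Python.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def simulate_spgh_distribution(pd_forecasts, rho, runs, seed=0):
    """Simulate Z_S under H0 in the one-factor model; returns sorted values."""
    rng = random.Random(seed)
    n = len(pd_forecasts)
    # Step 1: thresholds from the forecast PDs (equal to the true PDs under H0).
    thresholds = [STD_NORMAL.inv_cdf(p) for p in pd_forecasts]
    # Moments of the MSE under H0, see (15.17) and (15.18).
    expected_mse = sum(p * (1 - p) for p in pd_forecasts) / n
    std_mse = (sum((1 - 2 * p) ** 2 * p * (1 - p) for p in pd_forecasts) / n ** 2) ** 0.5
    w_sys, w_idio = rho ** 0.5, (1 - rho) ** 0.5
    simulated = []
    for _ in range(runs):                        # step 7: repeat steps 2 to 6
        f = rng.gauss(0.0, 1.0)                  # step 2: systematic factor
        squared_errors = 0.0
        for p, theta in zip(pd_forecasts, thresholds):
            eps = rng.gauss(0.0, 1.0)            # step 3: idiosyncratic risk
            latent = w_sys * f + w_idio * eps    # step 4: latent variable (score is part of theta)
            y = 1 if latent <= theta else 0      # step 5: default indicator
            squared_errors += (y - p) ** 2
        simulated.append((squared_errors / n - expected_mse) / std_mse)  # step 6
    simulated.sort()
    return simulated

def non_rejection_area(sorted_stats, alpha=0.05):
    """Step 8: empirical alpha/2 and 1 - alpha/2 quantiles of the simulated statistic."""
    last = len(sorted_stats) - 1
    return sorted_stats[int(alpha / 2 * last)], sorted_stats[int((1 - alpha / 2) * last)]

# Hypothetical portfolio: 200 obligors in grade 4 and 800 obligors in grade 8.
portfolio = [0.0011] * 200 + [0.0105] * 800
areas = {}
for rho in (0.0, 0.10):
    stats = simulate_spgh_distribution(portfolio, rho, runs=400, seed=1)
    areas[rho] = non_rejection_area(stats)
    print(rho, areas[rho])
```

With so few runs the simulated bounds are noisy, but the widening of the non-rejection area with increasing asset correlation is already visible.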




This approach permits a very flexible application because, according to requirements, several values for the asset correlation can be analysed with respect to their impact on the distribution of the test statistic. Secondly, the impact of the portfolio size may be studied, but this is not our focus, as in normal backtesting situations the portfolio is given. Nevertheless, one might get a feeling for the variance caused by low numbers of obligors and/or the impact of the supposed asset correlation $\rho$.

28We will discuss this point in detail later.


15.5.1.1 The Simultaneous Binomial Test (Sim Bin) 


The eight steps described above are sufficient to generate the simulated $\chi^2$-HSLS test statistic and the simulated SPGH-$Z_S$²⁹ in order to backtest a whole rating system, all grades simultaneously, under more practical conditions. For the exact binomial test a further challenge arises. Whereas the binomial test has been extended by means of the simulation to integrate correlation (the number of defaults under the simulation scenario divided by the number of obligors in the rating grade generates the simulated test distribution), there still is the problem of using the results of the grade-wise binomial tests for the backtesting of all grades simultaneously. Our aim is to draw a conclusion for the whole rating system and not just for a single grade.

The starting point of our consideration is the fact that for a rating system of 14 grades and a binomial test done with $\alpha = 0.10$, we have to expect that for 1.4 grades the correct null hypothesis will be rejected. Someone who assumes that a rating system is “good” only if the statistical test fails for no grade is mistaken.
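The arithmetic can be made explicit with a short sketch. It rests on an idealised assumption that is not made in the text: the 14 grade-wise tests are treated as independent and each is assumed to hold its level of 10% exactly.

```python
n_grades = 14
alpha = 0.10

# Expected number of grades for which a correct null hypothesis is rejected.
expected_false_rejections = n_grades * alpha

# Idealised assumption: independent tests, each with an exact level of alpha.
prob_at_least_one_rejection = 1.0 - (1.0 - alpha) ** n_grades

print(round(expected_false_rejections, 2), round(prob_at_least_one_rejection, 4))
```

Under this assumption a perfectly calibrated rating system would show at least one grade-wise rejection in roughly three out of four backtests, which is why a "no rejection in any grade" rule is too strict.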

Therefore, we suggest a two-stage approach within our simulation framework when the binomial test is used. The two steps are:

1. Generate the rejection areas for each grade individually (possibly taking some correlation into account with the help of the Monte-Carlo simulation) at a certain $\alpha$-level and conduct the test decision.

2. Count the number of “grade-wise rejections” per simulation run (introduce a step 7b in our 8 step approach) and use them to generate the distribution of the “sum of grade-wise rejections”. When the $(1 - \alpha_{sb})$-percentile of this distribution (i.e., the critical value) is exceeded by the observed sum of rejections of the individual grade-wise tests, the rating system as a whole fails the quality check.³⁰ Note that we perform a one-sided test at this second level. The reason is that a very low number of grade-wise rejections indicates a high quality rating system, whereas too many grade-wise rejections signal a low quality rating system.


29We have to emphasise that the simulated HSLS test statistic is generally not $\chi^2$-distributed, just as the simulated Spiegelhalter test statistic is not standard normally distributed, but for convenience we maintain the terms $\chi^2$ and $Z_S$.

30We use $\alpha_{sb}$ to label the simultaneous binomial test. We point out that the $\alpha$-level of the individual tests and the $\alpha_{sb}$-level of the distribution of the sum of the grade-wise rejections (simultaneous binomial test) need not be the same value.
15.5.1.2 Remarks on the Adherence to the $\alpha$-Level When Using the Exact Binomial Test


We would like to point out that, because of the discreteness of the binomial distribution, the $\alpha$-level that is actually held is lower than the nominal $\alpha$ would suggest. We call this phenomenon the “effect of dilution”. Therefore, a binomial test in general rejects less often than its nominal level implies, as can be seen for example in Fig. 15.4, where the probability of being in the non-rejection area (1-8 defaults) is 96.24% and therefore the real $\alpha$-level is 3.76%, which is evidently lower than the nominal level of 5%. The (correct) null hypothesis is rejected in far fewer cases than expected.

This is especially true for samples with a low number of borrowers. The effect disappears when the exact binomial distribution converges to the normal distribution with a growing number of borrowers, or to any other continuous distribution generated by simulation as described above.

The effect of dilution intensifies when using the simultaneous binomial test in stage two, as a discrete distribution is also used here (see e.g., Table 15.2).
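The effect of dilution can be quantified with a small sketch that computes the level actually attained by the two-sided exact binomial test. The function names are hypothetical; the two cases are the example of Fig. 15.4 ($N_g = 350$) and a grade with only 22 obligors, both with a forecast PD of 1.05% and a nominal level of 5%.

```python
def binomial_pmf_table(n, p):
    """P(X = k) for k = 0..n, built recursively to avoid huge binomial coefficients."""
    pmf = [(1.0 - p) ** n]
    for k in range(1, n + 1):
        pmf.append(pmf[-1] * (n - k + 1) / k * p / (1.0 - p))
    return pmf

def attained_alpha(n, p, alpha=0.05):
    """Probability mass of both rejection areas of the exact two-sided test."""
    pmf = binomial_pmf_table(n, p)
    lower_tail = 0.0
    for prob in pmf:                 # extend the lower rejection area while it fits into alpha / 2
        if lower_tail + prob > alpha / 2:
            break
        lower_tail += prob
    upper_tail = 0.0
    for prob in reversed(pmf):       # extend the upper rejection area while it fits into alpha / 2
        if upper_tail + prob > alpha / 2:
            break
        upper_tail += prob
    return lower_tail + upper_tail

alpha_350 = attained_alpha(350, 0.0105)
alpha_22 = attained_alpha(22, 0.0105)
print(round(alpha_350, 4), round(alpha_22, 4))
```

For the small grade the lower rejection area is empty, so the attained level consists of the upper tail only and falls even further below the nominal 5%.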


15.5.1.3 Simulation Study A: Impact of Portfolio Size and Correlation 


To demonstrate our theoretical findings above, we perform a small simulation 
study. We have three portfolios, each with the same relative distribution over the 
grades as shown in Fig. 15.5, but with different absolute size. We start with a small 
portfolio, with N = 200 obligors representing for example a portfolio of large 
corporates or financial institutions, next we have a portfolio of N = 1,000 acting 


Table 15.2 Results from the simulation study A, non-rejection areas, 1 Mio runs, $\alpha = 0.05$

                             SPGH                   HSLS      Identical    Sim Binᵃ   Exact Bin, g = 8
$\rho$   N        $N_{g=8}$  Lower       Upper      Upper     decisions    Upper      Lower     Upper
                             bound       bound      bound     in %ᵇ        boundᶜ     bound     bound
0.00     200      22         -1.7951     2.1437     34.40     95.74        1          0.0000    0.0455
0.01                         -1.8448     2.4306     35.64     95.93        1          0.0000    0.0455
0.10                         -2.0505     4.7912     54.98     99.45        1          0.0000    0.0909
0.00     1,000    110        -1.8697     2.0403     33.05     95.48        2          0.0000    0.0364
0.01                         -2.5659     3.1637     36.21     96.09        2          0.0000    0.0364
0.10                         -4.6298     9.5255     93.89     97.51        2          0.0000    0.0455
0.00     10,000   1,100      -1.9569     1.9665     28.91     95.39        2          0.0046    0.0164
0.01                         -6.1455     7.7193     65.90     98.05        2          0.0036    0.0200
0.10                         -14.0278    29.2670    527.55    97.50        4          0.0000    0.0391

a In the first and the second step we used $\alpha = 0.05$ for the simultaneous binomial test
b In percent of the 1 million simulation runs
c Marks the upper bound of the non-rejection area. For example in the first row ($\rho = 0.00$ and $N = 200$), simultaneous binomial test: If 2 or more grade-wise rejections are observed, the rating system as a whole would be rejected

Exact binomial test for rating grade 8: If a default rate of more than 0.0455 is observed (more than 22 · 0.0455 = 1 default) the null hypothesis can be rejected
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Fig. 15.5 Distribution of the borrowers over the rating grades in simulation study A 


as an example for a portfolio of middle sized corporates and finally we analyse a 
portfolio consisting of N = 10,000 obligors which could be seen as a portfolio of 
small business clients. 

The distribution we applied is bell-shaped³¹, as can be seen from Fig. 15.5, with
an average probability of default π = 0.0308 and π_g's according to the master scale
of Sect. 15.2.4 (e.g., π_{g=8} = 0.0105). All tests are done with α = 0.05.

In Table 15.2 the results of our simulation study are presented. We show the
lower and upper bound of the SPGH for the three portfolio sizes and, furthermore,
for three assumed asset correlations ρ = 0.00, ρ = 0.01 and ρ = 0.10. For the
HSLS it is sufficient to show only the upper bound because the lower bound is
fixed at zero. In the column titled “Identical decisions” we also report how often the
SPGH and HSLS came to the same test decision, as we want to analyse whether
one has to expect different (and therefore confusing) test decisions when
applying both tests. As we can see from our study, in 95 to >99% of the runs the HSLS and
SPGH reach the same test decision.

In general, we can state that when ρ increases, the distribution gets broader and
therefore the bounds of the non-rejection areas move outwards. For the
exact binomial test and the simultaneous binomial test in particular, this effect is somewhat
diluted because of the discrete character of these distributions.
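The broadening of the non-rejection area under correlation can be reproduced qualitatively with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not the simulation framework of Sect. 15.5.1: it assumes a one-factor asset value model, a hypothetical 14-grade master scale (grade 8 at a PD of 0.0105, neighbouring grades a factor of 1.8 apart), a hypothetical bell-shaped grade distribution, and far fewer runs than the 1 million used in the study. The SPGH statistic is written in its grade-wise form.

```python
import random
from statistics import NormalDist

ND = NormalDist()

# Hypothetical master scale: grade 8 has PD 0.0105, adjacent grades differ by 1.8.
PDS = [0.0105 * 1.8 ** (g - 8) for g in range(1, 15)]
# Hypothetical bell-shaped distribution of borrowers over grades, in percent.
WEIGHTS = [1, 2, 4, 7, 10, 13, 15, 15, 12, 9, 6, 3, 2, 1]


def spgh(defaults, counts, pds):
    """Spiegelhalter statistic: standardised deviation of the MSE from its expectation."""
    num = sum((d - n * p) * (1 - 2 * p) for d, n, p in zip(defaults, counts, pds))
    var = sum(n * p * (1 - p) * (1 - 2 * p) ** 2 for n, p in zip(counts, pds))
    return num / var ** 0.5


def simulate_bounds(n_total, rho, runs=400, alpha=0.05, seed=1):
    """Simulate the SPGH under H0 and return empirical (lower, upper) bounds."""
    rng = random.Random(seed)
    counts = [w * n_total // 100 for w in WEIGHTS]
    thresholds = [ND.inv_cdf(p) for p in PDS]
    stats = []
    for _ in range(runs):
        x = rng.gauss(0.0, 1.0)  # systematic factor shared by all borrowers
        defaults = []
        for n, c in zip(counts, thresholds):
            # PD conditional on the systematic factor (one-factor model, assumed)
            p_cond = ND.cdf((c - rho ** 0.5 * x) / (1 - rho) ** 0.5)
            defaults.append(sum(rng.random() < p_cond for _ in range(n)))
        stats.append(spgh(defaults, counts, PDS))
    stats.sort()
    return stats[int(alpha / 2 * runs)], stats[int((1 - alpha / 2) * runs) - 1]


for rho in (0.0, 0.10):
    lo, hi = simulate_bounds(1000, rho)
    print(f"rho={rho:.2f}: non-rejection area [{lo:.2f}, {hi:.2f}]")
```

With ρ = 0 the bounds should lie near the standard normal quantiles, while ρ = 0.10 should give a visibly wider and asymmetric interval. The exact numbers depend on the assumed master scale and will not match Table 15.2.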

When we look at the SPGH under ρ = 0.00, we clearly see how the approximation
to the standard normal distribution improves when the number of observations
is increased. For N = 10,000 we get very close to the Z_S = Φ^{-1}(0.025) ≈ -1.96
(lower bound) and Z_S = Φ^{-1}(0.975) ≈ +1.96 (upper bound) we expect. The same
is true in principle for the HSLS, but the convergence is much slower, as
χ²(0.95, 14) ≈ 23.68.


³¹ Often in banking practice the master scale is constituted in such a way that many obligors are rated
in the grades in the middle of the master scale and fewer in the very good or very bad grades.


Table 15.3 Simultaneous binomial test, portfolio size N = 10,000

Number of grade-wise rejections    ρ = 0.10    ρ = 0.00
 0                                 86.1791     60.7638
 1                                  5.9460     30.8927
 2                                  1.6888      7.2325
 3                                  0.9566      1.0092
 4                                  0.6389      0.0952
 5                                  0.5221      0.0067
 6                                  0.4445      -
 7                                  0.4462      -
 8                                  0.4679      -
 9                                  0.5172      -
10                                  0.5810      -
11                                  0.6455      -
12                                  0.5681      -
13                                  0.3122      -

What is interesting is that in the presence of asset correlation (ρ > 0.00), an
increase in N seemingly does not lead to a convergence of the boundaries to any value.
Instead, when we extend from N = 1,000 to N = 10,000, the non-rejection area
increases dramatically from [-4.6298; +9.5255] to [-14.0278; +29.2670] for
ρ = 0.10. The same holds for HSLS and Sim Bin but not for the exact binomial test.

Now we turn to the Sim Bin, for which we report the simulation details in Table 15.3.
As already stated above, using α = 0.05 we expect 0.05 · 14 = 0.7
grade-wise rejections on average (expected value). Because of the effect of dilution,
this value is not achieved, as can be calculated from Table 15.3: For ρ = 0.10
and N = 10,000, we get 0.57, whereas the effect of dilution is considerably stronger for
ρ = 0.00, where we get just 0.49. Therefore, the effect of dilution on step one and step
two is weakened when correlation is taken into account.
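The expected number of grade-wise rejections can be recomputed directly from the frequencies printed in Table 15.3. The short sketch below does only this arithmetic; the variable and function names are ours. Note that the printed rows of the ρ = 0.10 column add up to slightly less than 100%, so the value obtained from them falls a little short of the figure quoted in the text.

```python
# Percentages of simulation runs with k = 0, 1, 2, ... grade-wise rejections,
# copied from Table 15.3 as printed.
RHO_010 = [86.1791, 5.9460, 1.6888, 0.9566, 0.6389, 0.5221, 0.4445,
           0.4462, 0.4679, 0.5172, 0.5810, 0.6455, 0.5681, 0.3122]
RHO_000 = [60.7638, 30.8927, 7.2325, 1.0092, 0.0952, 0.0067]


def expected_rejections(percentages):
    """Expected number of grade-wise rejections implied by a frequency column."""
    return sum(k * pct for k, pct in enumerate(percentages)) / 100.0


nominal = 0.05 * 14  # expectation if every grade-wise test held alpha exactly
print(f"nominal:    {nominal:.2f}")
print(f"rho = 0.00: {expected_rejections(RHO_000):.2f}")
print(f"rho = 0.10: {expected_rejections(RHO_010):.2f} (from the printed rows only)")
```

Both values stay below the nominal 0.7, which is the effect of dilution at work in step one.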

We conclude this subject with the proposition that all three tests conducted
within our simulation framework are appropriate for backtesting. Which test
is preferred for a bank's backtesting is to some extent a matter of taste. We
tend to suggest SPGH because of its “most continuous” distribution generated by
the simulation.


15.5.1.4 Remarks on the Asset Correlation 


As can be seen from Table 15.2, the size of the asset correlation ρ has a very strong
impact on the distributions of the test statistics and therefore ultimately on the test
decisions themselves. We feel it is worthwhile to think twice about which asset correlation to
use. Though we do not want to describe in detail how asset correlations can be estimated,
we discuss some basic considerations regarding the right choice of asset
correlations and its impact on PD validation.

First of all, the asset correlations used in the backtesting of the bank's internal
rating model should be in line with the asset correlations used in other fields of the
bank-wide (credit) risk management systems, such as the credit portfolio model. This
guarantees a consistent bank-wide risk assessment.

In practice, asset correlations are often not estimated on bank internal data, but
based on empirical studies on external data which serve as a guideline. For
example, Hamerle et al. (2003) report that asset correlations in a point-in-time
rating framework are in a range of roughly 0.01-0.02. This is slightly higher than
assuming no asset correlation at all (the most conservative approach regarding
statistical backtesting) but much lower than the asset correlations used in the Basel
II framework. In the latter, the asset correlation depends on the corresponding
exposure class and varies from ρ = 0.04 (exposure class: Qualifying Revolving
Retail) over ρ = 0.15 (Residential Mortgage) up to ρ = 0.16 (Other Retail),
ρ = 0.24 (Corporates, Sovereigns and Banks), and even ρ = 0.30 for High Volatility
Commercial Real Estate. These Basel II asset correlations should not be taken as
best estimates of asset correlations; rather, they are set by regulatory policy
considerations with a view to being conservative.


15.5.2 Assessing the Test Power by Means 
of Monte-Carlo-Simulation 


15.5.2.1 Theoretical Background 


As mentioned above, a further application of the Monte-Carlo-Simulation is the
assessment of the type II error or its counterpart, the test power. Our aim is to derive
an approach for getting an idea of how well our tests work with respect to the test
power. In general, the power of a statistical hypothesis test measures the test's
ability to reject the null hypothesis when it is actually false, that is, to make a
correct decision.

Table 15.4 gives an overview of the possibilities of correct and incorrect 
decisions one can make with statistical hypothesis tests. 

The type II error (β-error) is defined as the probability of not rejecting H_0 when
in fact H_1 is right. The power of a statistical hypothesis test is defined as the
probability of not committing a type II error. It is calculated by subtracting the
probability of a type II error from one: power = 1 - β.


Table 15.4 Types of test decisions and their consequences

                          Test decision
                          H_0                        H_1
Reality   H_0 is true     Correct decision           Type I error (α-error)
          H_1 is true     Type II error (β-error)    Correct decision


Fig. 15.6 Illustration of α-error and β-error with the exact binomial test


Whereas we can control the α-error by setting α to a specific value (usually 0.01,
0.05, or 0.10), we have no simultaneous control of the β-error. The reason is that
the β-error depends on the hypothesis H_1. We will not go into theoretical details, but
demonstrate it with an example.

We revisit the example for the exact binomial test of Sect. 15.4.1.1 with
H_0: π_{g=8} = 0.0105 and H_1: π_{g=8} ≠ 0.0105. With this pair of hypotheses, there
is an infinite number of possible alternative hypotheses. Therefore, we have to
pick out one of these. For example, we can specify H_1: π_{g=8} = 2 · 0.0105 = 0.0210.
Thus, we can calculate the probability of detecting a false H_0 when the true PD of
the grade is twice as high as predicted.

The grey bars in Fig. 15.6 mark the distribution under H_1. The area outside of the
non-rejection area of H_0 (no default or at least 9 defaults) and under the H_1
distribution determines the test power.

In our example, we get a power of 0.3166. In general, ceteris paribus, the
power of a test rises if


• The number of borrowers rises,
• The distance of values under H_0 and H_1 (here the PDs) rises,
• The α-level is raised.
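The power calculation for the exact binomial test, and the three ceteris paribus statements above, can be checked numerically. The following is a hedged sketch: it assumes a two-sided test with at most α/2 in each tail, and the sample size n = 350 is our assumption, since the sample size behind Fig. 15.6 is not restated here. All function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k defaults among n borrowers with default probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def non_rejection_area(n, pd0, alpha):
    """Bounds (k_lo, k_hi) of the two-sided non-rejection area under H0: PD = pd0."""
    cdf, k_lo, k_hi = 0.0, None, n
    for k in range(n + 1):
        cdf += binom_pmf(k, n, pd0)
        if k_lo is None and cdf > alpha / 2:
            k_lo = k
        if cdf >= 1 - alpha / 2:
            k_hi = k
            break
    return k_lo, k_hi


def power(n, pd0, pd1, alpha=0.05):
    """Probability of rejecting H0: PD = pd0 when the true PD is pd1."""
    k_lo, k_hi = non_rejection_area(n, pd0, alpha)
    beta = sum(binom_pmf(k, n, pd1) for k in range(k_lo, k_hi + 1))  # type II error
    return 1.0 - beta


base = power(350, 0.0105, 0.0210)
print(f"base case (n=350, true PD doubled): {base:.4f}")
print(f"more borrowers (n=700):             {power(700, 0.0105, 0.0210):.4f}")
print(f"larger distance (true PD tripled):  {power(350, 0.0105, 0.0315):.4f}")
print(f"higher alpha (0.10):                {power(350, 0.0105, 0.0210, alpha=0.10):.4f}")
```

Under these assumptions the base case should land in the neighbourhood of the power quoted in the text, though exact agreement is not guaranteed. Because the binomial distribution is discrete, raising α only helps when it actually shifts one of the integer bounds.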


15.5.2.2 Simulation Study B: What is the “Best Test”? 


The concept of assessing the test power is obviously not restricted to the exact 
binomial test but applicable to other statistical tests and in particular, the SPGH test 
and the HSLS test and even the simultaneous binomial test. Furthermore, the 
concept works well in our framework which allows correlations. 
In the following, we take the simulation framework of Sect. 15.5.1 and add more
steps in order to analyse the test power. Now, steps one to eight have to be done
under H_0 and again under H_1. Finally the area outside the non-rejection area of H_0
has to be calculated under the H_1 distribution.³²

The focus is twofold: 


• First, we want to analyse how the power reacts under certain conditions such as
varying numbers of borrowers and/or asset correlations.

• Second, we want to analyse which of our tests (SPGH, HSLS or simultaneous
binomial) performs best.


We call test A better than another test B if it has more power (a lower type II
error) with respect to an alternative hypothesis H_1 but at the same time holds the
assumed α-level.³³, ³⁴

We emphasise that we do not want to carry out a stringent mathematical proof, 
but merely provide an initial glance within our simulation framework. 

This chapter is strongly orientated towards real banking practice and we continue
this approach in this subsection: We distinguish three modes which may serve
as point alternative hypothesis H_1:


• Mode 1: a fraction 1 - q of all borrowers is assumed to be classified in the correct
grade, whereas the fraction q is randomly distributed over all rating grades.

• Mode 2: all borrowers are graded up by s grades

• Mode 3: all borrowers are graded down by s grades³⁵


Whereas Mode 2 and Mode 3 describe a systematic, monotonic error in the
rating system,³⁶ Mode 1 represents a mixture of incorrect ratings and might be the
most realistic problem in backtesting rating systems.
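The three modes can be sketched as simple transformations of the assigned grades into the grades assumed to be true under H_1. This is an illustration only: the function names are ours, the uniform draw in Mode 1 is an assumption (the text only says "randomly distributed"), and the direction of the shift in Mode 2 is our reading by analogy with Mode 3, where the borrowers should in fact have received a worse grade. Grade 1 is taken as the best and grade 14 as the worst grade, both acting as absorbing boundaries.

```python
import random

N_GRADES = 14  # grade 1 is best, grade 14 is worst


def shift_grades(grades, s):
    """Modes 2 and 3: move every borrower by s grades (s > 0 worse, s < 0 better).

    Grades 1 and 14 act as absorbing boundaries.
    """
    return [min(N_GRADES, max(1, g + s)) for g in grades]


def mix_grades(grades, q, rng):
    """Mode 1: a fraction q of the borrowers gets a random grade (uniform, assumed)."""
    result = list(grades)
    for i in rng.sample(range(len(result)), k=round(q * len(result))):
        result[i] = rng.randint(1, N_GRADES)
    return result


assigned = [1, 4, 8, 8, 12, 14]
print("Mode 1 (q = 0.5):", mix_grades(assigned, 0.5, random.Random(0)))
print("Mode 2 (s = 1):  ", shift_grades(assigned, -1))
print("Mode 3 (s = 1):  ", shift_grades(assigned, +1))
```

The PDs under H_1 then follow from looking up the shifted grades in the master scale, while the test statistics continue to use the PDs of the assigned grades.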

Table 15.5 shows the result of our simulation study. As expected, an increase in 
portfolio size leads, ceteris paribus, generally to an increase in power. This is true 
for the three tests and for the three modes regarded. Further on, an increase in asset 
correlation — leaving the portfolio size constant — decreases the power. 


³² We assume hereby again that the relative frequency resulting from the 1 million runs is a good
enough approximation for the probability.

³³ This is similar, but not identical, to the concept of the “uniformly most powerful test”. A test is
called a “uniformly most powerful test” at level α if under a given initial situation it maximizes
the probability of rejecting H_0 on all distributions or parameter values belonging to the
alternative hypothesis H_1.

³⁴ The latter is fulfilled automatically as we derived the boundaries of the non-rejection area within
the simulation.

³⁵ Rating grade 1 (14) has an upper (lower) ‘absorbing boundary’ which means that a borrower in
the first (last) rating grade remains in it and cannot become better (worse).

³⁶ Within the master scale we use (see Sect. 15.2.4) the PD from one rating grade to the next worse
grade increases by a factor between 1.75 and 2 depending on the specific grade.
Table 15.5 Results from the simulation study B, power, 1 Mio runs, α = 0.05

ρ      N        SPGH       HSLS       Sim Bin
Mode 1: q = 0.5
0.00   200      0.1746     0.2408     0.1071
0.01            0.1482     0.2345     0.1066
0.10            0.0686     0.1595     0.1071
0.00   1,000    0.7644     0.9987     0.9763
0.01            0.4345     0.9954     0.9763
0.10            0.1001     0.8239     0.9759
0.00   10,000   >0.9999    >0.9999    >0.9999
0.01            0.6839     >0.9999    >0.9999
0.10            0.1111     0.9606     >0.9999
Mode 2: all borrowers graded up by s = 1
0.00   200      0.1927     0.0203     0.0015
0.01            0.1863     0.0200     0.0016
0.10            0.0036     0.0204     0.0016
0.00   1,000    0.7605     0.0291     0.0139
0.01            0.4697     0.0228     0.0138
0.10            0.1369     0.0130     0.0138
0.00   10,000   >0.9999    >0.9999    0.9996
0.01            0.7510     0.6141     0.9996
0.10            0.1543     0.0078     0.9996
Mode 3: all borrowers graded down by s = 1
0.00   200      0.3428     0.1699     0.1568
0.01            0.2836     0.1719     0.1563
0.10            0.1217     0.1385     0.1560
0.00   1,000    0.9119     0.4875     0.4277
0.01            0.5854     0.4275     0.4282
0.10            0.1362     0.1905     0.4295
0.00   10,000   >0.9999    >0.9999    >0.9999
0.01            0.7771     0.8669     >0.9999
0.10            0.1388     0.2212     >0.9999


It is remarkable that the SPGH already achieves a power near to or over 0.5 at
N = 1,000 and ρ = 0.01 or lower for all three modes. But the
picture is quite mixed for the HSLS and the Sim Bin. These two tests perform
worse in comparison to the SPGH, especially for Mode 2 and a small portfolio size.

When analysing the relative competitiveness of the SPGH, HSLS and Sim Bin, the
picture is ambiguous. Regarding Mode 1, which stands for an interchange of
obligors' assessed ratings, HSLS seems to be the best choice. SPGH outperforms
when the systematic upgrade by one grade is analysed as an alternative hypothesis.
In some situations even the Sim Bin has the highest power.

What can we learn from this simulation study about power and what are the
consequences for practical backtesting? We conclude that unfortunately none of the
statistical tests we analysed clearly outperforms the others in all circumstances. For
practical purposes, all tests should be performed when an assessment of the probability
of not detecting a low quality rating system is required.

What is most important of all is that higher management in particular should be
aware that there is³⁷ a (perhaps significant) probability that in fact H_0 is wrong, but
the statistical tools did not reveal this. Our simulation approach can be interpreted
as an instrument to fulfil this purpose.


15.6 Creating Backtesting Data Sets: The Concept 
of the Rolling 12-Month-Windows 


Up to now we have shown some concepts for statistical backtesting, but when 
dealing with real data, the first step is always to create a specific sample on which a 
meaningful analysis can be carried out. 

In banking practice ratings are performed continually over the year, for instance,
when a new customer must be evaluated, a credit line requires extension, new
information (e.g., financial figures) concerning a borrower already in the portfolio
comes up, or questions of any kind regarding the creditworthiness arise.

We propose an approach for creating backtesting samples clearly in line with


• The definition of what a rating is, namely a forecast for the 1-year-PD.

• What could be found in the IT-database at any point of time we may look into it.

• The general way in which a bank manages its credit risks, including the calculation of
Basel II risk capital.


From these guidelines, it follows that whenever we look into the rating database
we find the bank's best assessment of the borrower's probability of default for the
next year. This is irrespective of how old the rating is at the time we look into the
database. The reason is that whenever the bank has an indication of a
noteworthy change in the creditworthiness of the borrower (its PD), it has to
alter the rating immediately.³⁸ This means that a re-rating just once a year, for
example whenever new annual accounts are available, might not be adequate
when other relevant information regarding the PD becomes
available in any form. When there is no change in the rating, it remains valid and states
the same thing every day, namely the forecast of the 1-year-PD from the day we found it in
the database.

In the same way, the second essential variable, the defaults and non-defaults, 
have to be collected. 

The backtesting sample is delimited according to the principle of the
reporting date. We call this approach “cutting slices” or “rolling 12-months-window”
(compare to Fig. 15.7).


³⁷ This is true even if the hypothesis H_0, “The rating system forecasts the PD well”, could not be
rejected at a certain level α.


³⁸ See BCBS (2005a), § 411 and § 449.


Fig. 15.7 Concept of the rolling 12-months-windows: the backtesting slices


We start with the first slice called “Q1/2004”, which begins in January 2004. We
look in the database and find borrower A with rating grade 8. He was rated with
grade 8 a few months before (and gets other ratings after 1 January 2004), but
has grade 8 at the beginning of January 2004. Within the next 12 months (up to the
end of December 2004) he did not default, which is marked by a symbol in Fig. 15.7. He
enters the slice “Q1/2004” as a non-default with rating grade 8 (y_A = 0;
π_{g=8} = 0.0105). The second borrower, B, enters with grade 10 but as a default, because
he defaulted somewhere in the third quarter of 2004, again marked by a symbol (y_B = 1;
π_{g=10}). Borrower C was not found in the rating database on January 1, 2004, as he
was rated for the first time just before the beginning of the second quarter of 2004.
Therefore he is not contained in slice “Q1/2004”. Borrower D enters with grade 12
as a non-default, because the default we observe lies after the end of the 12-month period,
which ends on December 31, 2004. Borrower E is found in the database with
rating grade 5, but he ended the business connection with the bank (also marked by a symbol).
Therefore it is impossible to observe whether he has defaulted or survived within the
12-month period. This observation for borrower E should be included in the slice
“Q1/2004” as a weighted non-default, where the weight is calculated as the ratio
(number of months it has been observed)/12. Ignoring the observation or counting it in full
may cause biases.
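The construction of a slice can be sketched in code. The example below is a minimal illustration under stated assumptions, not a description of the bank's actual data processing: the `Borrower` record, the function `build_slice` and all dates are hypothetical, the window is assumed to start on the first day of a month, and the months observed for a borrower who leaves the bank are counted as full calendar months before the exit.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Borrower:
    name: str
    ratings: List[Tuple[date, int]]      # (rating date, grade), ascending by date
    default_date: Optional[date] = None
    exit_date: Optional[date] = None     # end of the business relationship


def build_slice(borrowers, start):
    """Return (name, grade, default flag, weight) rows for the window starting at `start`."""
    end = date(start.year + 1, start.month, start.day)
    rows = []
    for b in borrowers:
        known = [g for d, g in b.ratings if d < start]  # ratings in the ultimo file
        gone = any(x is not None and x < start for x in (b.default_date, b.exit_date))
        if not known or gone:
            continue  # not rated yet, or already defaulted or left
        grade = known[-1]  # rating valid at the reporting date
        if b.default_date is not None and b.default_date < end:
            rows.append((b.name, grade, 1, 1.0))
        elif b.exit_date is not None and b.exit_date < end:
            months = (b.exit_date.year - start.year) * 12 + b.exit_date.month - start.month
            rows.append((b.name, grade, 0, months / 12))  # weighted non-default
        else:
            rows.append((b.name, grade, 0, 1.0))
    return rows


# Hypothetical dates mimicking borrowers A to E of the example.
portfolio = [
    Borrower("A", [(date(2003, 9, 10), 8), (date(2004, 5, 3), 7)]),
    Borrower("B", [(date(2003, 11, 20), 10)], default_date=date(2004, 8, 17)),
    Borrower("C", [(date(2004, 3, 25), 6)]),
    Borrower("D", [(date(2003, 6, 30), 12)], default_date=date(2005, 2, 10)),
    Borrower("E", [(date(2003, 10, 5), 5)], exit_date=date(2004, 7, 15)),
]
for row in build_slice(portfolio, date(2004, 1, 1)):
    print(row)
```

Calling `build_slice` with later start dates yields the subsequent slices, so that the same borrower can appear in several overlapping windows with different grades.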

In the same way, the following slices have to be constructed. We show the
compositions of the slices as a summary in the left side of Fig. 15.7.

For practical purposes, end-of-month (ultimo) data files are best suited. So for the slice
“Q1/2004”, we use the ultimo data files from December 2003. In Fig. 15.7 we present the
slices on a quarterly basis, but sample creation can also be done on a monthly basis.
This has the advantage that some elements of monitoring are fulfilled and nearly no
rating and default is lost. The only exception is when a rating changes within a
month, in which case the initial rating is not seen in the ultimo data file. The same is
true when a rating is completed and the rated borrower defaults before he
has passed his first end of month. We recommend analysing these special cases
separately, for example with regard to the detection of fraud.

When using the method of rolling 12-month-windows introduced above, it should be
noted that the slices greatly overlap. For a balanced portfolio (entries and exits are balanced,
dates of rating compilations are evenly distributed over the year) of
borrowers with long-term business relationships, two subsequent slices may overlap
by about 11/12. As a consequence, we expect that we often get the same test results
for two or more subsequent slices. We will see this in the next section, where we
demonstrate our theoretical considerations by applying them to real world rating
data.


15.7 Empirical Results 


15.7.1 Data Description 


In this section, we demonstrate the application of our concepts to real rating data.
The data used is part of a rating system introduced at the beginning of 2004 for
small business clients in Germany.³⁹ We analysed slices beginning in February
2004 up to January 2005.⁴⁰ So for backtesting slice “Jan2005”, we considered the
defaults and non-defaults up to the end of December 2005. Here we can see that
backtesting a complete vintage of ratings in fact requires a period of two years.

The rating system follows mainly the architecture sketched in Sect. 15.2.2, and is
composed of various parallel sub-models for the machine rating module. These sub-models
differ according to whether there is a tradesman, freelancer/professional⁴¹
or a micro corporate to be rated. Micro corporates dominate with about 45% of all
ratings, followed by tradesmen (about 30%) and the remaining freelancers and
professionals with about 25%.

The basic structure of all sub-models contains approximately a dozen quantitative
and qualitative risk drivers, as is usual for this kind of portfolio in banking
practice. Within the second module, “expert guided adjustment”, up- or downgrading
of the machine rating can be done. For micro corporates a “supporter
logic” module is available.


³⁹ In order to avoid disclosure of sensitive business information, the data base was restricted to a
(representative) sub-sample.


⁴⁰ For the construction of, e.g., the slice “Feb2004” we used the ultimo data store of 31 January
2004.


⁴¹ Like architects, doctors, or lawyers.

In our empirical analysis, we want to examine the slices “Feb2004” to “Jan2005”
and, in detail, the comprehensive slice “Jan2005”. Altogether, more than 26,000
different ratings can be analysed in the slices “Feb2004” to “Jan2005”. Whereas
slice “Feb2004” consists of little more than a hundred ratings because of the recent
launch of the rating system, the numbers in the slices increase steadily up to more
than 24,000 in “Jan2005”.

Note that with our concept of rolling 12-months-windows, the slices overlap to a
high degree. For example, “Jan2005” and “Dec2004” have 88% of their observations in
common, slices “Jun2004” and “Jul2004” about 75%.


15.7.2 The First Glance: Forecast Versus Realised Default Rates 


When talking about the quality of a rating system, we get a first impression by 
looking at forecast default rates and realised default rates. Figure 15.8 shows that 
realised default rates vary between 2 and 2.5%, whereas the forecast PD under- 
estimates the realised default rate slightly for almost all slices. 

Furthermore, it can be seen that on average, the final rating is more conservative
than the machine rating. This means that the “expert guided adjustments” and
“supporter logic” on average lead to a downgrade of borrowers. This might be an
interesting result, because in banking practice the opposite is often assumed. The
line of thought is that rating analysts or loan managers are primarily interested in
selling loans, which is easier (because of bank internal competence guidelines or
simply because of questions regarding the credit terms) if the machine rating is upgraded
by the expert. The “accurate rating” is often assumed to be of subordinate importance
for the loan manager. Here we have an example which contradicts this
hypothesis. We will see in Sect. 15.7.4 whether the difference between machine rating
and final rating regarding the quality of forecasts is significant.


Fig. 15.8 Realised default rate versus forecast default rate by slice


15.7.3 Results of the Hypothesis Tests for all Slices 


As we are interested in whether the deviation of the final rating from the default rates
is significant, we focus on the SPGH and the HSLS test. Table 15.6 shows the results.

For ρ = 0.01, the SPGH rejects the null hypothesis of “being calibrated” in no slice,
whereas the HSLS narrowly rejects it in two slices. For the very conservative
approach with ρ = 0.00, the null hypothesis has to be rejected in some slices for
SPGH and HSLS. The simultaneous binomial test (results are not shown here
explicitly) shows that even for ρ = 0.00 the null hypothesis could not be rejected in
any slice, which also indicates a good quality of the rating system. Note the different test
decisions of the SPGH and HSLS tests for some slices.


Table 15.6 Test decisions by slice, final rating, 1 Mio runs, α = 0.05

Slice     ρ      SPGH                                        HSLS
                 Lower      Upper      Test       Decision   Upper       Test       Decision
                 bound      bound      statistic             bound       statistic
Feb2004   0.00   -1.5063     2.2255     0.2075    No rej.     25.3473     5.7982    No rej.
Mar2004          -1.8380     2.0736    -0.3214    No rej.     27.7315     6.1607    No rej.
Apr2004          -1.8948     2.0137     0.2490    No rej.     21.5598     6.8883    No rej.
May2004          -1.9512     1.9780     0.9859    No rej.     21.3653    10.8339    No rej.
Jun2004          -1.9549     1.9697     2.0617    Rej.        20.8402    17.1008    No rej.
Jul2004          -1.9544     1.9697     1.3236    No rej.     20.6058    33.3231    Rej.
Aug2004          -1.9549     1.9673     2.0724    Rej.        20.3097    67.6734    Rej.
Sep2004          -1.9626     1.9675     2.4033    Rej.        20.3765    78.3339    Rej.
Oct2004          -1.9570     1.9691     2.1408    Rej.        20.5659    68.2907    Rej.
Nov2004          -1.9575     1.9604     1.6973    No rej.     20.6235    70.2873    Rej.
Dec2004          -1.9592     1.9629     1.0893    No rej.     20.6672    78.3400    Rej.
Jan2005          -1.9569     1.9620     0.9927    No rej.     20.9511    96.3306    Rej.
Feb2004   0.01   -1.5063     2.3911     0.2075    No rej.     26.5294     5.7982    No rej.
Mar2004          -2.2839     2.9406    -0.3214    No rej.     30.5962     6.1607    No rej.
Apr2004          -3.1715     4.0670     0.2490    No rej.     29.4874     6.8883    No rej.
May2004          -3.9862     5.1376     0.9859    No rej.     35.4975    10.8339    No rej.
Jun2004          -4.7208     6.1255     2.0617    No rej.     43.0297    17.1008    No rej.
Jul2004          -5.5315     7.2272     1.3236    No rej.     53.6896    33.3231    No rej.
Aug2004          -6.2755     8.2214     2.0724    No rej.     65.1878    67.6734    Rej.
Sep2004          -6.9194     9.0275     2.4033    No rej.     76.8287    78.3339    Rej.
Oct2004          -7.5017     9.7802     2.1408    No rej.     90.2356    68.2907    No rej.
Nov2004          -8.0797    10.5260     1.6973    No rej.    103.8628    70.2873    No rej.
Dec2004          -8.6682    11.2619     1.0893    No rej.    119.0537    78.3400    No rej.
Jan2005          -9.1811    11.9508     0.9927    No rej.    130.8062    96.3306    No rej.

From Table 15.6, we can also see how well the approximation of the SPGH to
the standard normal under H_0 works as the number of ratings in the slices increases
for ρ = 0.00. The same is true for the HSLS, when we take into account that only
10 of 14 grades have a large number of observations⁴²: χ²(0.95, 10) = 20.48.
Secondly, we might find it impressive how broad the non-rejection area is when
taking correlation into account, even for a very low asset correlation of
ρ = 0.01. Notice that the non-rejection areas for ρ = 0.01 of SPGH, HSLS and
Sim Bin get even broader when the number of ratings increases, although the
relative distribution of the borrowers over the grades only changes negligibly. The
same phenomenon was observed in the simulation study A, Table 15.2.


15.7.4 Detailed Analysis of Slice “Jan2005” 


Now we turn to a more detailed analysis of slice “Jan2005”, as we can observe up to 
now that the rating system passes our quality checks well. The distribution, not 
shown here explicitly, of the observations over the rating grades, is roughly bell- 
shaped, for example about 900 observations in grade 4, up to 4,500 in grade 8 and 
1,000 in grade 12. 

We can see in Fig. 15.9 that for three rating grades, the realised default rate is in
the rejection area for the binomial test. Hereby we assumed ρ = 0.01. The realised
default rate increases in the rating grades as is assumed and therefore confirms our
previous impression of the rating system we obtained from the SPGH and HSLS.

Next we analysed the power of our tests. As can be seen from Table 15.7, the
high number of ratings leads to a high power of all tests in all analysed circumstances.
When assuming no correlation, the power is >0.9999 for each of the three
tests. When assuming ρ = 0.01 we get, e.g., for the SPGH in Mode 3, a power of
0.7548. This means that the SPGH, when in fact all borrowers should have got a
rating one grade worse, would have detected this with a probability of about 75%.


⁴² Rating grades 1 to 3 of the master scale are intended mainly for sovereigns, international large
corporates and financial institutions with excellent creditworthiness and can be achieved by
small business clients only in exceptional cases. The worst rating grade is assigned to a very low
number of borrowers in the database, which is understandable because the rated portfolio mainly
consists of initial ratings, so potential borrowers with a low creditworthiness are not accepted by
the bank at all and therefore do not get into the rating database.


342 R. Rauhmeier 


[Figure: realised default rate by rating grade, grades 1 to 14]


Fig. 15.9 Realised default rates and exact binomial test by grades, slice “Jan2005”, 1 Mio runs,
ρ = 0.01, α = 0.05


Table 15.7 Analysis of power, final rating, slice “Jan2005”, 1 Mio runs, α = 0.05

ρ      SPGH       HSLS       Sim Bin
Mode 1: q = 0.5
0.00   >0.9999    >0.9999    >0.9999
0.01   0.7894     >0.9999    >0.9999
Mode 2: all borrowers graded up by s = 1
0.00   >0.9999    >0.9999    >0.9999
0.01   0.6798     0.4888     0.5549
Mode 3: all borrowers graded down by s = 1
0.00   >0.9999    >0.9999    >0.9999
0.01   0.7548     0.8201     0.8227


To get a complete picture of the quality of the rating system and the regarded
portfolio, we look at its discriminatory power.⁴³ Figure 15.10 displays the ROC-Curves
for the machine rating and the final rating. For both rating modules, no
discrepancies could be observed from the ROCs. We see that the ROC-Curve of the
final rating is always above the ROC-Curve of the machine rating, indicating an
increase in discriminatory power when human expert assessment is brought into
account. The AUROC of the final rating is therefore a bit higher (0.7450) than
that of the machine rating (0.7258).


Fig. 15.10 ROC-curve for final rating and machine rating, slice “Jan2005”


Table 15.8 Machine rating versus final rating

                 MSE      AUROC    H_0                                          p-value
Machine rating   0.0230   0.7258   MSE_{mach. rating} = MSE_{fin. rating}       <0.0001
Final rating     0.0226   0.7450   AUROC_{mach. rating} = AUROC_{fin. rating}   <0.0001


⁴³ For a definition of the measures ROC-Curve and AUROC and their statistical properties, we
refer to Chap. 13.


As can be seen from Table 15.8, the AUROC and MSE of the machine rating
and final rating differ significantly. For comparing the MSE, we used the
Redelmeier test described in detail in Sect. 15.4.4.⁴⁴
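For readers who want to compute the two summary measures of Table 15.8 on their own data, the following sketch shows one straightforward way. It is not the procedure used for the table and contains no significance test: the MSE is taken as the mean squared difference between forecast PD and default indicator, and the AUROC is computed as the share of (defaulter, non-defaulter) pairs in which the defaulter has the higher PD, with ties counted as one half. The function names and the toy data are hypothetical.

```python
def mse(pds, defaults):
    """Mean squared error between forecast PDs and default indicators (0 or 1)."""
    return sum((y - p) ** 2 for p, y in zip(pds, defaults)) / len(pds)


def auroc(pds, defaults):
    """Area under the ROC curve via pairwise comparison of defaulters and non-defaulters."""
    pos = [p for p, y in zip(pds, defaults) if y == 1]
    neg = [p for p, y in zip(pds, defaults) if y == 0]
    wins = sum((d > n) + 0.5 * (d == n) for d in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: forecast PDs of four borrowers, the third one defaulted.
pds = [0.01, 0.02, 0.05, 0.10]
defaults = [0, 0, 1, 0]
print(f"MSE = {mse(pds, defaults):.5f}, AUROC = {auroc(pds, defaults):.4f}")
```

The pairwise loop is quadratic in the number of borrowers. For portfolios of the size analysed here, a rank-based computation would be preferable, but the pairwise form keeps the definition visible.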

As an overall result, the rating system passes our quality checks very well.
With the high number of ratings in the analysed portfolio, we would have been able
to detect potential shortcomings, but we did not find any. As the system was
introduced 2 years ago and this was the first backtest that was performed, this
good result is all the more remarkable.


⁴⁴ As it was a prerequisite that the machine rating should pass a test on calibration we conducted the
SPGH and the HSLS. We find that we could not reject the null hypothesis of being calibrated with
ρ = 0.01, but we have to reject the null hypothesis with ρ = 0.00.


15.8 Conclusion


In this chapter we dealt with the validation of rating systems constructed to forecast a
1-year probability of default. We focused on statistical tests and their
application for bank internal purposes, especially in the Basel II context. We
built up a simulation based framework to take account of dependencies in defaults
(asset correlation), which additionally has the potential to appraise the type II error,
i.e., the non-detection of a bad rating system, for arbitrary scenarios. Within this
framework, the well known exact and approximated binomial tests and the Hosmer-Lemeshow
χ² test are used, but we also introduced the less popular Spiegelhalter test and an
approach called simultaneous binomial test, which allow the testing of a complete
rating system and not just each grade separately. As it is important for banks to
compare the quality of modules of their rating system, we also refer to the
Redelmeier test. As for any applied statistical method, building test samples is an
important issue. We designed the concept of “the rolling 12-months-window” to
fulfil the Basel II and the bank’s internal risk management requirements and to
use the bank’s IT environment (rating database) effectively. The concept is in harmony
with our definition of what a rating should reflect, namely the bank’s most accurate
assessment of the 1-year PD of a borrower. All concepts are demonstrated in detail with a
very up-to-date, real-life bank internal rating data set.

We focus mainly on statistical concepts for rating validation (backtesting), but it
has to be emphasised that a comprehensive and adequate validation in the spirit
of Basel II requires much more. To name a few, the requirements include adherence to
defined bank internal rating processes, accurate and meaningful use of ratings in the
bank’s management systems and correct implementation in the IT environment.


Appendix A 


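We show that the SPGH test statistic $Z_S$ is equal to the test statistic $Z_{bin}$ of the
approximated binomial test in the case where there is only one single PD. This is when
all obligors are rated in the same rating grade $g$. We start with (15.19) and substitute
$\hat{\pi}_i = \pi_g$, respectively $\pi_i = \pi_g$, because we argue under $H_0$:

$$
\begin{aligned}
Z_S &= \frac{\frac{1}{N_g}\sum_{i=1}^{N_g}(y_i-\pi_g)^2-\frac{1}{N_g}\sum_{i=1}^{N_g}\pi_g(1-\pi_g)}{\sqrt{\frac{1}{N_g^2}\sum_{i=1}^{N_g}(1-2\pi_g)^2\,\pi_g(1-\pi_g)}}\\
&= \frac{(1-2\pi_g)\left(\sum_{i=1}^{N_g}y_i-N_g\,\pi_g\right)}{\sqrt{N_g}\,(1-2\pi_g)\,\sqrt{\pi_g(1-\pi_g)}}
= \frac{\sum_{i=1}^{N_g}y_i-N_g\,\pi_g}{\sqrt{N_g\,\pi_g(1-\pi_g)}}
\end{aligned}
$$

and get (15.14).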
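The identity of Appendix A can be checked numerically. The sketch below is a hedged illustration and not the authors' implementation: it assumes the Spiegelhalter statistic in its usual form (realised MSE minus its expectation under $H_0$, divided by its standard deviation under $H_0$) and a single grade PD below 0.5; the function names and the sample data are hypothetical.

```python
from math import sqrt
from typing import Sequence


def spiegelhalter_z(pd_forecasts: Sequence[float], defaults: Sequence[int]) -> float:
    """(MSE - E[MSE]) / sqrt(Var[MSE]) with all moments taken under H0."""
    n = len(defaults)
    realised = sum((y - p) ** 2 for p, y in zip(pd_forecasts, defaults)) / n
    expected = sum(p * (1.0 - p) for p in pd_forecasts) / n
    variance = sum((1.0 - 2.0 * p) ** 2 * p * (1.0 - p) for p in pd_forecasts) / n ** 2
    return (realised - expected) / sqrt(variance)


def binomial_z(pd_grade: float, defaults: Sequence[int]) -> float:
    """Test statistic of the normal approximation to the binomial test."""
    n = len(defaults)
    return (sum(defaults) - n * pd_grade) / sqrt(n * pd_grade * (1.0 - pd_grade))


# Hypothetical grade: 100 obligors, forecast PD of 2%, 3 observed defaults.
pd_g = 0.02
defaults = [1] * 3 + [0] * 97

z_s = spiegelhalter_z([pd_g] * len(defaults), defaults)
z_bin = binomial_z(pd_g, defaults)
print(f"Z_S = {z_s:.6f}, Z_bin = {z_bin:.6f}")
```

Both statistics coincide up to floating-point error for this input, as the derivation predicts. For a grade PD above 0.5 the factor $(1-2\pi_g)$ is negative and the two statistics differ in sign.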


Appendix B 


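We want to derive the test statistic $Z_R$ of the Redelmeier test as it is shown in (15.22)
according to Redelmeier et al. (1991). We start with the MSE from module 1 as

$$
MSE_{m1} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{\pi}_{i,m1}\right)^2
= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-2\,y_i\,\hat{\pi}_{i,m1}+\hat{\pi}_{i,m1}^2\right) \tag{15.24}
$$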


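Because of the randomness of the defaults the MSE will differ from its expected
value

$$
E(MSE_{m1}) = \frac{1}{N}\sum_{i=1}^{N}E\left[\left(y_i-\hat{\pi}_{i,m1}\right)^2\right]
= \frac{1}{N}\sum_{i=1}^{N}\left(\pi_i-2\,\pi_i\,\hat{\pi}_{i,m1}+\hat{\pi}_{i,m1}^2\right) \tag{15.25}
$$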
The difference of the realized and the expected MSE for module 1 is

$$
d_{m1} = MSE_{m1}-E(MSE_{m1})
= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-2\,y_i\,\hat{\pi}_{i,m1}-\pi_i+2\,\pi_i\,\hat{\pi}_{i,m1}\right) \tag{15.26}
$$

The same consideration has to be done for module 2:

$$
d_{m2} = MSE_{m2}-E(MSE_{m2})
= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-2\,y_i\,\hat{\pi}_{i,m2}-\pi_i+2\,\pi_i\,\hat{\pi}_{i,m2}\right) \tag{15.27}
$$


To determine whether two sets of judgments are equally realistic we compare the
difference between $d_{m1}$ and $d_{m2}$:

$$
d_{m1}-d_{m2} = \frac{2}{N}\sum_{i=1}^{N}\left(\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)\left(\pi_i-y_i\right) \tag{15.28}
$$


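As can be seen from (15.28), the true but unknown PD $\pi_i$ is still required and
therefore has to be assessed. A choice might be to set all $\pi_i$ equal to the average of the
corresponding judgments $(\hat{\pi}_{i,m1}, \hat{\pi}_{i,m2})$ (consensus forecast).⁴⁵ This seems to be a
reasonable choice since we presumed that each module itself has satisfied the null
hypothesis of being compatible with the data. Using the consensus forecast

$$
\pi_i = 0.5\cdot\left(\hat{\pi}_{i,m1}+\hat{\pi}_{i,m2}\right) \tag{15.29}
$$

we get

$$
\begin{aligned}
d_{m1}-d_{m2} &= \frac{2}{N}\sum_{i=1}^{N}\left(\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)\left(0.5\,\hat{\pi}_{i,m1}+0.5\,\hat{\pi}_{i,m2}-y_i\right)\\
&= \frac{1}{N}\sum_{i=1}^{N}\left(\hat{\pi}_{i,m1}^2-\hat{\pi}_{i,m2}^2\right)-\frac{2}{N}\sum_{i=1}^{N}y_i\left(\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)\\
&= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-2\,y_i\,\hat{\pi}_{i,m1}+\hat{\pi}_{i,m1}^2\right)-\frac{1}{N}\sum_{i=1}^{N}\left(y_i-2\,y_i\,\hat{\pi}_{i,m2}+\hat{\pi}_{i,m2}^2\right)\\
&= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{\pi}_{i,m1}\right)^2-\frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{\pi}_{i,m2}\right)^2\\
&= MSE_{m1}-MSE_{m2}
\end{aligned}
\tag{15.30}
$$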


It is interesting that, if we use the consensus forecast for substituting $\pi_i$,
the term $d_{m1}-d_{m2}$ is simply the difference of the two realized MSEs.

In the next step we calculate the variance using the fact that the expected value of
$d_{m1}-d_{m2}$ is zero under the null hypothesis, see (15.23).


⁴⁵ Other approaches are possible, e.g. one may get the “true” $\pi_i$’s from an external source.
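$$
\begin{aligned}
Var(d_{m1}-d_{m2}) &= Var\left(\frac{2}{N}\sum_{i=1}^{N}\left(\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)\left(\pi_i-y_i\right)\right)\\
&= \frac{4}{N^2}\sum_{i=1}^{N}\left(\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)^2\cdot\pi_i\cdot\left(1-\pi_i\right)
\end{aligned}
\tag{15.31}
$$

Finally we get the test statistic

$$
Z_R = \frac{d_{m1}-d_{m2}}{\sqrt{Var(d_{m1}-d_{m2})}}
= \frac{\sum_{i=1}^{N}\left[\left(\hat{\pi}_{i,m1}^2-\hat{\pi}_{i,m2}^2\right)-2\left(\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)y_i\right]}{\sqrt{\sum_{i=1}^{N}\left(\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)^2\left(\hat{\pi}_{i,m1}+\hat{\pi}_{i,m2}\right)\left(2-\hat{\pi}_{i,m1}-\hat{\pi}_{i,m2}\right)}} \tag{15.32}
$$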
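The closed form obtained at the end of Appendix B is short enough to compute directly. The following Python sketch assumes independent defaults and the consensus forecast (15.29) as in the derivation; the two-sided p-value additionally assumes that the statistic is compared with the standard normal distribution. Function names and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence


def redelmeier_z(pd_m1: Sequence[float], pd_m2: Sequence[float],
                 defaults: Sequence[int]) -> float:
    """Redelmeier statistic for two PD forecasts of the same obligors."""
    if not (len(pd_m1) == len(pd_m2) == len(defaults)) or not defaults:
        raise ValueError("inputs must be non-empty and of equal length")
    # Numerator: N * (MSE_m1 - MSE_m2), written without the common y_i terms.
    numerator = sum((p1 * p1 - p2 * p2) - 2.0 * (p1 - p2) * y
                    for p1, p2, y in zip(pd_m1, pd_m2, defaults))
    # Denominator: N * sqrt(Var) with pi_i replaced by the consensus forecast.
    spread = sum((p1 - p2) ** 2 * (p1 + p2) * (2.0 - p1 - p2)
                 for p1, p2 in zip(pd_m1, pd_m2))
    if spread == 0.0:
        raise ValueError("the two modules give identical forecasts")
    return numerator / sqrt(spread)


def two_sided_p_value(z: float) -> float:
    """p-value under the assumption of a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical forecasts of two modules for five obligors.
defaults = [0, 1, 0, 0, 1]
module_1 = [0.10, 0.20, 0.10, 0.30, 0.40]
module_2 = [0.05, 0.50, 0.10, 0.10, 0.60]

z = redelmeier_z(module_1, module_2, defaults)
print(f"Z_R = {z:.4f}, p-value = {two_sided_p_value(z):.4f}")
```

A positive value indicates that module 1 has the larger realised MSE; swapping the two modules only changes the sign of the statistic.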
Chapter 16
Development of Stress Tests for Credit Portfolios 


Volker Matthias Gundlach 


16.1 Introduction 


Advanced portfolio models combined with naive reliance on statistics in credit risk
estimations run the danger of underestimating latent risks and neglecting the peril
arising from very rare, but not unrealistic risk constellations. The latter might be
caused by abnormal economic conditions or dramatic events for the portfolio of
a single credit institution or a complete market. This includes events of a political or
economic nature. To limit the impact of such sudden incidents, it is necessary to study
fictional perturbations and to shock-test the robustness or vulnerability of risk
characteristics. This procedure is known as stress testing. It allows the review and
updating of risk strategies, risk capacities and capital allocation. Thus it can
play an important role in risk controlling and management in a credit institution.

This view is shared by banking supervisors, in particular by the Basel Committee
on Banking Supervision of the Bank for International Settlements (BIS).
Consequently, stress testing for credit risk plays a role in the regulatory requirements
of the Revised Framework on the International Convergence of Capital Measurement
and Capital Standards (Basel II). Nevertheless, it has not reached the standards
of stress testing for market risk estimations, which has been common practice for
several years (see Breuer and Krenn 1999).

In the following, we describe the purpose and significance of stress testing for
credit risk evaluations. Then we recall the regulatory requirements, in particular those of
the Basel II framework. We describe how stress tests work and present some
well-established forms of stress tests, a classification for them and suggestions on how to
deal with them. We also include examples for illustration. To conclude, we offer
a concept for an evolutionary way towards a stress testing procedure. This is done
in view of the applicability of the procedure in banks.


16.2 The Purpose of Stress Testing


Stress testing means (regular) expeditions into an unknown, but important territory:
the land of unexpected events and losses. It requires anticipating risks which could,
but need not arise in the future and results in the determination of possible
unexpected losses. As the latter are of immense relevance for financial institutions,
there is growing interest in this topic. While it is already a challenging task to win
enthusiasm among senior risk management for the rather theoretical values of
unexpected losses, it is even more difficult to achieve acceptance for the quantitative
output of stress tests. It therefore makes sense to reduce such evaluations to (relative)
comparisons of the unexpected losses in stress and normal situations.

Moreover, there are various reasons for conducting stress testing due to the
explicit or implicit relation between unexpected loss and economic capital or regulatory
capital, respectively. The definition of unexpected loss is crucial for the understanding
of and the approach towards stress testing. Though it is clear that this quantity
should be covered by economic capital, there is no general agreement as to how
to define unexpected loss.

It is quite common to regard the difference between expected loss and the
value-at-risk (VaR) of a given confidence level, or the expected shortfall exceeding the
VaR, as unexpected loss. One of the problems with this approach is that such an
unexpected loss might not only be unexpected, but also quite unrealistic, as its
definition is purely of a statistical nature. Therefore, it is sensible to use stress tests
to underscore which losses amongst the unexpected are plausible, or to use the
outcome of stress tests instead of unexpected loss to determine economic capital.
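The purely statistical definitions mentioned above can be made concrete with a small Python sketch. It is an illustration under assumptions, not a prescribed method: losses are taken from a given sample (for example, the output of a portfolio simulation), the VaR is an empirical quantile, and the expected shortfall is the mean of the losses strictly above the VaR. Other quantile and tail conventions exist, and the function name is hypothetical.

```python
import math
from typing import Dict, Sequence


def loss_measures(losses: Sequence[float], level: float = 0.99) -> Dict[str, float]:
    """Empirical expected loss, VaR, unexpected loss (VaR - EL) and expected shortfall."""
    if not losses:
        raise ValueError("need at least one loss observation")
    if not 0.0 < level < 1.0:
        raise ValueError("confidence level must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    # Smallest order statistic covering at least `level` of the sample;
    # the rounding guards against floating-point noise in level * n.
    k = max(1, math.ceil(round(level * n, 9)))
    value_at_risk = ordered[k - 1]
    expected_loss = sum(ordered) / n
    tail = [x for x in ordered if x > value_at_risk]
    expected_shortfall = sum(tail) / len(tail) if tail else value_at_risk
    return {
        "EL": expected_loss,
        "VaR": value_at_risk,
        "UL": value_at_risk - expected_loss,
        "ES": expected_shortfall,
    }


# Hypothetical loss sample: the integers 1 to 100.
measures = loss_measures(list(range(1, 101)), level=0.95)
print(measures)
```

Applying the same function to a loss sample simulated under stress and to one simulated under normal conditions gives the relative comparison of unexpected losses suggested earlier in this section.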

Though the idea of using stress tests for estimating economic capital seems quite
straightforward, it is only rarely realized, as it requires reliable occurrence
probabilities for the stress events. With these, one could use the expected loss under
stress as an economic capital requirement. Nevertheless, stress tests are mainly used
to challenge the regulatory and economic capital requirements determined by
unexpected loss calculations. This can be done as a simple test of adequacy,
but also to derive a capital buffer for extreme losses exceeding the unexpected
losses, and to define the risk appetite of a bank. For new credit products like credit
derivatives used for hedging against extreme losses, it might be of particular
importance to conduct stress tests on the valuation and capital requirements.

Using stress tests to evaluate capital requirements has the additional advantage
of allowing the combination of different kinds of risk, in particular market,
credit and liquidity risk, but also operational risk and other risks such as
reputational risk. Because time horizons for market and credit risk transactions are
different, and it is common for banks to use different confidence levels for the
calculation of VaRs for credit and market risk (mainly due to the different time
horizons), joint considerations of market and credit risk are difficult and seldom
used. Realistic stress scenarios influencing various kinds of risk could therefore
lead to extreme losses, which could be of enormous importance for controlling risk
and should be reflected in the capital requirements.

In any case, there can be strong correlations between the developments of
market, liquidity and credit risk which could result in extreme losses and should
not be neglected. Consequently, investigations into events causing simultaneous
increases in market and credit risk are more than reasonable. An overview of
several types of risk relevant for stress testing can be found in Blaschke et al.
(2001).

The quantitative outcome of stress testing can be used in several places for 
portfolio and risk management: 


• Risk buffers can be determined and/or tested against extreme losses

• The risk capacity of a financial institution can be determined and/or tested
against extreme losses

• Limits for sub-portfolios can be fixed to avoid given amounts of extreme losses

• Risk policy, risk tolerance and risk appetite can be tested by visualising the
risk/return under abnormal market conditions


Such approaches focusing on quantitative results might be of particular interest 
for sub-portfolios (like some country-portfolios), where the historic volatility of the 
respective loans is low, but drastic changes in risk relevant parameters cannot be 
excluded. 

Stress tests should not be reduced to their purely quantitative features. They
can and should also play a major role in the portfolio management of a bank, as they
offer the possibility of testing the structure and robustness of a portfolio against
perturbations and shocks. In particular they can represent a worthwhile tool to


• Identify potential risks and locate the weak spots of a portfolio

• Study effects of new intricate credit products

• Guide discussion on unfavourable developments like crises and abnormal market
conditions, which cannot be excluded

• Help monitor important sub-portfolios exhibiting large exposures or extreme
vulnerability to changes in the market

• Derive some need for action to reduce the risk of extreme losses and hence
economic capital, and mitigate the vulnerability to important risk relevant
effects

• Test the portfolio diversification by introducing additional (implicit) correlations

• Question the bank’s attitude towards risk


16.3 Regulatory Requirements 


As we have seen in the previous section, the benefits of using stress tests are
manifold for controlling and portfolio management. The Basel II Revised Framework
also acknowledges this fact, see Basel Committee on Banking Supervision (2004).
There, stress testing appears in Pillar 1 (about the minimum capital
requirements) and Pillar 2 (about the supervisory review process) for banks using
the IRB approach. The target of the requirements is improved risk management.




The requirements in the Basel II Revised Framework are not precise. They can
be summarized as follows:¹


• Task: Every IRB bank has to conduct sound, significant and meaningful stress
testing to assess the capital adequacy in a reasonably conservative way. In particular,
major credit risk concentrations have to undergo periodic stress tests.
Furthermore, stress tests should be integrated in the internal capital adequacy
process, in particular in risk management strategies that respond to the outcome of
stress testing.

• Intention: Banks shall ensure that they have enough capital at their disposal to meet the
regulatory capital requirements even in the case of stress.

• Requirements: Banks should identify possible events and future changes in
economic conditions, which could have disadvantageous effects on their credit
exposure. Moreover, the ability of the bank to withstand these unfavourable
impairments has to be assessed.

• Design: A quantification of the impact on the parameters probability of default
(PD), loss given default (LGD) and exposure at default (EAD) is required.
Rating migrations should also be taken into account.


Special notes on how to implement these requirements include: 


• The use of scenarios like:
  - Economic or industry downturn
  - Market-risk events
  - Liquidity shortage
  is recommended.
• Recession scenarios should be considered; worst-case scenarios are not required.
• Banks should use their own data for estimating rating migrations and integrate
  the insight of rating migrations in external ratings.
• Banks should build their stress testing also on the study of the impact of smaller
  deterioration in the credit environment.


Though the requirements for stress testing are mainly contained in Pillar 1 of 
Basel II, the method is a fundamental part of Pillar 2, since it is an important way of 
assessing capital adequacy. This explains the lack of extensive regulations for stress 
testing in that document as Pillar 2 acknowledges the ability to judge risk and use 
the right means for this procedure. As another consequence, not only regulatory 
capital should be the focus of stress tests, but also economic capital as the counter- 
part of the portfolio risk as seen by the bank. 

Not only the BIS (see CGFS 2000, 2001 and 2005) promotes stress testing; some
central banks and regulators² have also addressed this topic (e.g., Deutsche
Bundesbank 2003 and 2004; Fender et al. 2001), in particular regarding the stability


¹ The exact formulations can be found in §434-§437, §765, §775 and §777 of BIS (2004).


² Regulators are also interested in contagion, i.e. the transmission of shocks in the financial system.
This topic is not part of this contribution.

of financial systems. They have published statements which can be regarded as
supplements to the Basel II Revised Framework. These publications give a better 
impression of the regulatory goals and basic conditions for stress testing, which can 
be summarized as: 


• Stress tests should consider extreme deviations from normal developments and
hence should invoke unlikely, but still plausible situations, i.e. situations with a
low probability of occurrence.

• Stress tests should also consider constellations which might occur in future and
which have not yet been observed.

• Financial institutions should also use stress testing to become aware of their risk
profile and to challenge their business plans, target portfolios, risk policies, etc.

• Stress testing should not only be used to check the capital adequacy, but
also to determine and question limits for awarding credit.

• Stress testing should not be treated only as an amendment to the VaR evaluations
for credit portfolios, but as a complementary method, which contrasts the
purely statistical approach of VaR methods by including causally determined
considerations for unexpected losses. In particular, it can be used to specify
extreme losses in a qualitative and quantitative way.


16.4 Risk Parameters for Stress Testing 


The central point of the procedure of stress testing, also seen in Basel II, is the
change in risk parameters. For regulatory capital, these parameters are given by the
probability of default (PD), loss given default (LGD) and exposure at default (EAD).
In this connection, the dominant role is in most cases played by the variations of PD, as
LGD and EAD are lasting quantities which, due to their definition, should already
be conditioned to disadvantageous situations, namely the default of the obligor. The
possibilities of stress effects are hence restricted, especially for EAD. The latter
might be worsened by a few exogenous factors such as the exchange rate, but these
should also be partly considered in the usual EAD. The exogenous factors affecting
the EAD might only be of interest if they also have an impact on the other risk
parameters and hence could lead to an accumulation of adverse effects.

The possible variations of the LGD depend heavily on the procedure used to
determine this quantity. Thus, deviations which might arise from the estimation
methods should be determined, as well as parts of the process that might depend on
economic conditions. As the determination of the LGD is conditioned, by definition,
to the unfavourable situation of a default, it should take into account lasting values
for collateral and lead to values that can be seen as conservative. Thus, not too
many factors should be left that could lead to extreme changes in the LGD.
Mainly the valuation of collateral could have some influence which cannot be
neglected when stressing the LGD. In particular, it might be possible that factors
affecting the value of the collateral also have an impact on other risk parameters
and hence should be taken into account.

For stressing derivative products like credit default swaps (CDS), collateralized debt
obligations (CDOs) and CDO², it might make sense to investigate the effects on the
LGD. Very often these products contain leverage effects or are exposed to systematic
risk. These phenomena could be observed, for example, in the subprime crisis,
when the burst of the real estate bubble in the US had enormous effects on the value
of houses and hence on the LGDs of corresponding credits, leading to even stronger
deteriorations of the LGDs for the respective CDSs and CDOs. This might indicate how
complex the evaluation of LGDs can be.

The PD is by far the most popular risk parameter which is varied in stress tests.
There are two main reasons why variations in the PD of an obligor can occur. On
the one hand, the assignment of an obligor to a rating class might change due to
altered inputs for the rating process. On the other hand, the realised default rates of
the rating classes themselves might change, e.g., because of modified economic
conditions and their impact on the performance of the loans. This allows two options for
the design of the integration of PDs into stress testing: modifications either of the
assignment to rating classes or of the PDs of the rating classes for stress tests.

Altered assignments of rating classes for obligors in stress tests have the
advantage that they also allow the inclusion of transitions to non-performing loans. The
change of PDs corresponds to a change of rating class. The possible deviation in
the assignment of rating classes can be induced by the rating procedure. Thus, the
possible variations and the sensitivity of the input for the rating process should
be investigated in order to get a first estimate for possible deviations. Consequently,
in addition to the analysis of historic data for rating transitions, expert opinions on the
rating methodology should be part of the design process for the stress test.

The modification of PDs for the rating classes could have its origin in systematic
risk, i.e. in the dependence on risk drivers, one of the main topics in designing stress
tests, as will be discussed below. While it is sensible to estimate the volatility of
PDs in a first step and use the outcome of this procedure for tests on regulatory
capital, the differentiation of the effects of systematic and idiosyncratic risk on PD
deviations should be considered in a second step. This will lead to more advanced
and realistic stress tests, in particular on economic capital.

An analysis of the transition structure for rating classes might also be used to
determine PDs under stress conditions. The advantage of modifying PDs over
modifying the assignment of rating classes is a greater variety of possible
changes; the disadvantage is the absence of a modified assignment to the
performing and non-performing portfolio. The latter has to take place on top of the
modification of PDs.
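The difference between the two options can be sketched in a few lines of Python. Everything in the example is hypothetical: the master scale, the number of notches and the stress factor are illustrative assumptions and not values from the text.

```python
from typing import Dict, List

# Hypothetical master scale: rating class -> 1-year PD; the last class is "default".
MASTER_SCALE: Dict[int, float] = {1: 0.001, 2: 0.005, 3: 0.02, 4: 0.08, 5: 1.0}
DEFAULT_CLASS = max(MASTER_SCALE)


def stress_by_downgrade(rating: int, notches: int) -> int:
    """Option 1: move the obligor to a worse rating class (may end in default)."""
    return min(rating + notches, DEFAULT_CLASS)


def stress_class_pds(factor: float) -> Dict[int, float]:
    """Option 2: scale the PD of every performing class; assignments stay unchanged."""
    return {
        rating: pd if rating == DEFAULT_CLASS else min(pd * factor, 0.9999)
        for rating, pd in MASTER_SCALE.items()
    }


portfolio: List[int] = [1, 3, 4, 4]  # rating classes of four hypothetical obligors

# Option 1: a one-notch downgrade moves the class 4 obligors into default.
downgraded = [stress_by_downgrade(r, 1) for r in portfolio]
non_performing = sum(1 for r in downgraded if r == DEFAULT_CLASS)

# Option 2: doubling the class PDs leaves every obligor in the performing portfolio.
stressed_scale = stress_class_pds(2.0)
stressed_pds = [stressed_scale[r] for r in portfolio]

print(downgraded, non_performing)
print(stressed_pds)
```

The second option offers a continuous choice of stress levels, but, as noted above, transitions into the non-performing portfolio then have to be modelled separately.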

For estimating economic capital, PD, LGD and EAD might not be sufficient to
design stress tests. In addition, parameters used for displaying portfolio effects,
including correlations between the loans or the common dependence on risk drivers,
are needed.³ Investigations on historic crises for credit risk show that correlations


³ The basis for widely used portfolio models like CreditRisk+ or CreditMetrics, which are used by
banks for estimating the VaR, is provided by factor models. The (abstract) factors are used to

and risk concentration exhibit huge deviations in these circumstances. In any case,
their variations should be considered in stress tests with portfolio models if
possible. Some advanced models for estimating economic capital might even require
more information, in particular economic conditions.

Portfolio models such as CreditMetrics not only consider the default of loans, 
but also the change of value by using migration probabilities. In this case, the 
migration probabilities should be stressed in the same way as PDs. 

Stressing of risk parameters in tests need not take place for the whole portfolio;
it can be restricted to parts of it. Also, the strength of the parameter modification might
depend on sub-portfolios and credit products. Such approaches are used to account
for different sensitivities of parts of the portfolio to risk relevant influences or
to study the vulnerability of certain (important) sub-portfolios. They can be
particularly interesting for investigations on economic capital with the help of portfolio
models. In these cases, parameter changes for parts of the portfolio need not have a
smaller impact than analogous variations for the whole portfolio due to effects of
concentration risk or diversification, respectively.


16.5 Evaluating Stress Tests 


As stress testing should be a part of the internal capital adequacy process, there 
should be an understanding of how to use the outcome of stress tests for controlling 
and managing portfolio risk. The starting point for this should be the regulatory and 
economic capital as output of the underlying stress tests. The first task consists of 
checking whether the financial institution holds sufficient capital to also cover the 
requirements in the stress situation. As there should be limits, buffers and policies 
to guarantee this, the evaluation of stress testing should be also used to review these 
tools. Since the latter might be applicable to different portfolio levels (e.g. limits for 
sub-portfolios, countries, or obligors) they should be checked in detail. 

The concept of stress testing would be incomplete without knowing when action 
has to be considered as a result of the outcome of tests. It makes sense to introduce 
indicators and thresholds for suggesting when 


• To inform management about potential critical developments

• To develop guidelines for new business in order to avoid the extension of
existing risky constellations

• To reduce risk for the portfolio or sub-portfolios with the help of securitisation
and syndication

• To readjust an existing limit management system and the capital buffer for credit
risk

• To re-think the risk policy and risk tolerance


represent systematic risk affecting the loans. In these models it makes sense to stress the strength of
the dependence on the factors and the factors themselves.

Indicators for a call to action could be


• The increase of risk indicators such as expected loss, unexpected loss or expected
shortfall over a threshold or by a specified factor

• The increase of capital requirements (regulatory or economic) over a threshold
or by a specified factor

• The solvency ratio of capital and capital requirements falling below a threshold

• A low solvency level for meeting the economic capital requirements under stress

• A specified quantile of the loss distribution for the portfolio under stress
conditions not lying within a specified quantile of the loss distribution for the
original portfolio

• Expected loss for the portfolio under stress conditions exceeding the standard risk
costs (calculated on the basis of expected loss for the duration of the loans) by a
specified factor or getting too close to the unexpected loss for the unstressed
portfolio

• The risk/return lying above a specified threshold, where risk is measured in terms
of unexpected loss
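Some of the indicators above translate directly into simple threshold checks. The sketch below is only an illustration: the dictionary keys, the threshold values and the messages are hypothetical and would have to be set by the institution itself.

```python
from typing import Dict, List


def triggered_indicators(
    base: Dict[str, float],
    stressed: Dict[str, float],
    available_capital: float,
    max_el_factor: float = 1.5,
    max_capital_factor: float = 1.25,
    min_solvency_ratio: float = 1.0,
) -> List[str]:
    """Return the indicators that call for action, given base and stressed results."""
    alerts: List[str] = []
    if stressed["EL"] > max_el_factor * base["EL"]:
        alerts.append("expected loss increased by more than the specified factor")
    if stressed["capital_requirement"] > max_capital_factor * base["capital_requirement"]:
        alerts.append("capital requirement increased by more than the specified factor")
    if available_capital / stressed["capital_requirement"] < min_solvency_ratio:
        alerts.append("solvency ratio under stress below threshold")
    if stressed["EL"] > base["UL"]:
        alerts.append("stressed expected loss exceeds unstressed unexpected loss")
    return alerts


# Hypothetical results of a stress test (in millions).
base = {"EL": 10.0, "UL": 40.0, "capital_requirement": 80.0}
stressed = {"EL": 18.0, "UL": 60.0, "capital_requirement": 95.0}

alerts = triggered_indicators(base, stressed, available_capital=90.0)
for alert in alerts:
    print(alert)
```

In this example the expected loss and the solvency ratio trigger an alert, while the increase in the capital requirement stays within the tolerated factor.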


The interpretation of the outcome of stress tests on economic capital can
easily lead to misapprehensions, in particular if the capital requirement is
estimated on the basis of a VaR for a rather large confidence level. The motivation
for the latter approach is the avoidance of insolvency by holding enough
capital, except for some very rare events. Stress tests might simulate situations
coming quite close to these rare events. Adhering to the large confidence levels
for estimating economic capital offers the possibility of comparing the capital
requirements under different conditions, but the resulting VaR or economic
capital should not be used to question the solvency. In fact, it should be
considered whether to use adapted confidence levels for stress testing or to
rethink the appropriateness of high confidence levels. One can see the probability
of occurrence or the plausibility of a stress test as a related problem. For
a detailed discussion of this topic and an approach to its resolution, we refer to
Breuer and Krenn (2001).


16.6 Classifying Stress Tests 


According to regulatory requirements, a bank should perform stress tests on its 
regulatory as well as its economic capital. This differentiation of stress tests is not 
essential and mainly technical, as the input for determining these two forms of 
capital might be quite different as described in the previous section. 

Another technical reason for differentiating stress tests is the division into 
performing and non-performing loans, as their respective capital requirements 
follow different rules. For non-performing loans, loss provisions have to be made. 
Thus one has to consider the following cases for stress tests: 

• A performing loan gets downgraded but remains a performing loan: the
estimation of economic capital involves updated risk parameters.

• A performing loan gets downgraded and becomes a non-performing loan:
provisions have to be estimated involving the net exposures calculated with
the LGD.

• A non-performing loan deteriorates: the provisions have to be increased on the
basis of a deteriorated LGD.


As already discussed in the previous section, defaults can be included in stress
tests via a worsened assignment to rating classes. If stress tests focus on PDs rather
than rating classes, then stress rates for the transition of performing to
non-performing loans are required for the same purpose. Ideally, they depend on ratings,
industry sectors, economic states, etc. and are applied to the portfolio after stressing the
PDs. Moreover, the methodology of a bank to determine the volume of the
provision for a defaulted credit should be considered. A typical approach is to
equate the loss amount given the default (i.e. the product of LGD with the exposure)
with the provision.
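As a minimal sketch of this approach, the following Python example applies rating-dependent stress transition rates to a portfolio and equates the provision with the product of LGD and exposure. The rates, the loan data and the function name are hypothetical; in practice the rates could also depend on industry sector or economic state, as described above.

```python
from typing import Dict, List, TypedDict


class Loan(TypedDict):
    rating: str
    lgd: float   # loss given default as a fraction of the exposure
    ead: float   # exposure at default


def expected_stress_provisions(loans: List[Loan],
                               stress_transition_rate: Dict[str, float]) -> float:
    """Expected additional provisions: transition rate x LGD x exposure, summed over loans."""
    return sum(
        stress_transition_rate[loan["rating"]] * loan["lgd"] * loan["ead"]
        for loan in loans
    )


# Hypothetical stressed rates for the transition from performing to non-performing.
rates = {"A": 0.01, "B": 0.05}
loans: List[Loan] = [
    {"rating": "A", "lgd": 0.40, "ead": 1_000.0},
    {"rating": "B", "lgd": 0.50, "ead": 2_000.0},
]

provisions = expected_stress_provisions(loans, rates)
print(f"expected additional provisions under stress: {provisions:.2f}")
```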

Typical ways to categorize stress tests can be taken over from market risk. They
are well documented in the literature (CGFS 2005 and Deutsche Bundesbank 2003
and 2004). The most important way to classify stress tests is via the methodology.
With respect to technique, one can distinguish statistically based and model
based methods; with respect to conceptual design, sensitivity analysis and
scenario analysis. While the latter is based on modelling economic variations,
sensitivity analysis is statistically founded. The common basis for all these
specifications is the elementary requirement for stress tests to perturb the risk
parameters. These can be the basic risk parameters (EAD, LGD, PD) of the loans, as
already mentioned for the tests on the regulatory capital. However, these can also be
parameters used in a portfolio model, like asset correlations or dependencies on
systematic risk drivers.

The easiest way to perform stress tests is a direct modification of the risk
parameters and belongs to the class of sensitivity analysis. The goal is to study
the impact of major changes in the parameters on the portfolio values. For this
method, one or more risk parameters are increased (simultaneously) and the
evaluations are made for this new constellation. The increase of parameters should
depend on statistical analysis and/or expert opinion. As these stress tests are not
linked to any event or context and are executed for all loans of a (sub-)portfolio,
without respect to individual properties, we refer to them as flat or uniform stress
tests. Most popular are the flat stress tests for PDs, where the increase of the default
rates can be derived from transition rates between the rating classes. An advantage
of these tests is the possibility of performing them simultaneously at different
financial institutions and aggregating these results to check the financial stability
of a system. This is done by several central banks. Such tests are suited to checking
the space and buffer for capital requirements, but they are of little help for
portfolio and risk management.
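A flat stress test for PDs can be sketched as follows. The example is an illustration under assumptions, not a regulatory calculation: it uses the Basel II IRB risk weight function for corporate exposures without the maturity adjustment and without any scaling factor, and the portfolio and the stress factor of 1.5 are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist
from typing import List, Tuple

STD_NORMAL = NormalDist()
Loan = Tuple[float, float, float]  # (PD, LGD, EAD)


def corporate_correlation(pd: float) -> float:
    """Asset correlation of the Basel II corporate risk weight function."""
    weight = (1.0 - exp(-50.0 * pd)) / (1.0 - exp(-50.0))
    return 0.12 * weight + 0.24 * (1.0 - weight)


def capital_ratio(pd: float, lgd: float) -> float:
    """Capital per unit of exposure: LGD x (conditional PD at 99.9% - PD)."""
    r = corporate_correlation(pd)
    quantile = (STD_NORMAL.inv_cdf(pd) + sqrt(r) * STD_NORMAL.inv_cdf(0.999)) / sqrt(1.0 - r)
    return lgd * (STD_NORMAL.cdf(quantile) - pd)


def flat_pd_stress(portfolio: List[Loan], factor: float) -> List[Loan]:
    """Multiply every PD by the same factor, capped below 1."""
    return [(min(pd * factor, 0.9999), lgd, ead) for pd, lgd, ead in portfolio]


def capital_requirement(portfolio: List[Loan]) -> float:
    return sum(ead * capital_ratio(pd, lgd) for pd, lgd, ead in portfolio)


# Hypothetical performing portfolio.
portfolio: List[Loan] = [
    (0.005, 0.45, 1_000_000.0),
    (0.010, 0.45, 2_000_000.0),
    (0.030, 0.40, 500_000.0),
]

base_capital = capital_requirement(portfolio)
stressed_capital = capital_requirement(flat_pd_stress(portfolio, 1.5))
print(f"base: {base_capital:,.0f}  stressed: {stressed_capital:,.0f}  "
      f"increase: {stressed_capital / base_capital - 1.0:.1%}")
```

Because the same factor is applied to every loan, the result says something about the capital buffer as a whole, but nothing about which obligors or sectors drive the increase.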
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Model based methods for stress testing incorporate observable risk drivers, in 
particular, macroeconomic variables for representing the changes of risk parameters. 
In the following, we will refer to these risk drivers as risk factors. The respective 
methods rely on the existence of a model — mainly based on econometrical methods — 
that explains the variations of the risk parameters by changes of such risk factors. One 
can distinguish univariate stress tests, which are defined by the use of a single, 
isolated risk factor, and multivariate stress tests, where several factors are changed 
simultaneously. These tests can be seen as a refinement of those previously 
described: stressing the risk factors leads to modified risk parameters which are 
finally used for the evaluation of the capital requirements. Note that risk factors can 
have quite different effects on risk parameters throughout a portfolio. Changes in the 
risk factors can lead to upgrades as well as downgrades of risk parameters. For 
example, an increase in the price of resources such as oil or energy can have a negative
impact on PDs in the automobile or any other industry consuming lots of energy, but
it could have a positive impact on the PDs in the countries trading these resources.

By using univariate stress tests, banks can study specific and especially relevant 
impacts on their portfolios. This has the benefit of isolating the influence of an 
important observable quantity. Consequently, it can be used to identify weak spots 
in the portfolio structure. Thus, univariate stress tests represent another kind of 
sensitivity analysis, now in terms of risk factors instead of risk parameters. They 
have the disadvantage of possibly leading to an underestimation of risk by neglect- 
ing potential effects resulting from possible correlations of risk factors. 

This shortcoming is overcome by using multivariate stress tests. The price is the
reliance on additional statistical analysis, assumptions or the establishment of 
another model describing the correlation of the risk factors involved. This is done 
in a framework known as scenario analysis, where hypothetical, historical and 
statistically determined scenarios are distinguished. It results in the determination 
of stress values for the risk factors which are used to evaluate stress values for the 
risk parameters. 

With respect to the design of scenarios, we can distinguish approaches driven
by the portfolio (bottom-up approaches) from those driven by events (top-down
approaches). Bottom-up approaches tend to use the results of sensitivity analysis
to identify sensitive dependence on risk factors as starting points. As a consequence,
those scenarios are chosen which involve the risk factors having the largest
impact. For example, for a bank focusing on real estate, GDP, employment rate,
inflation rate and spending capacity in the countries in which it operates will be of more
relevance than the oil price, exchange rates, etc. Thus, it will look for scenarios
involving the relevant risk factors. Top-down approaches start with a chosen
scenario, e.g., the terror attack in New York on September 11, 2001, and require
the analysis of the impact of this scenario on the portfolio. The task in this situation
is to identify those tests which cause the most dramatic and relevant changes.

Historical scenarios are typical examples of top-down approaches. They refer to 
extreme constellations of the risk factors which were observed in the past and in the 
majority of the cases can be related to historical events and crises. They are 
transferred to the current situation and portfolio. This can be seen as a disadvantage
of this approach, as the transferred values may no longer be realistic. Another
drawback is that generally, it is not possible to specify the probability of the 
scenario occurring. 

Statistically determined scenarios might also depend on historical data. They
are based on the (joint) statistical distribution of risk factors. In this approach,
scenarios might be specified by quantiles of such distributions. Whilst it might be
very difficult to produce suitable distributions, in particular joint distributions, the
advantage is that it is possible to evaluate the probability of the scenario occurring,
as this is given by the complement of the confidence level used for the quantile. The
existence of such probabilities of occurrence allows the calculation of expected
extreme losses which can be used for the estimation of economic capital. The
crucial point of this approach is the generation of a suitable risk factor distribution.
Only if the latter is chosen to be compatible with the state of the economy (and hence
does not rely too heavily on historic data) can useful conclusions for the management
of the portfolio be derived.
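
As a minimal sketch of a statistically determined univariate scenario, the following Python snippet reads a stress value off an empirical quantile of a risk factor and reports the probability of occurrence as the complement of the confidence level. The sample of yearly index changes is simulated and purely hypothetical, as is the 99% confidence level; a joint scenario for several factors would additionally require a model of their dependence, which is not attempted here.

```python
import random
import statistics

random.seed(1)

# Hypothetical history: 1,000 yearly relative changes of a stock index (assumption).
changes = [random.gauss(0.05, 0.20) for _ in range(1000)]

confidence = 0.99

# Adverse scenarios are large drops, so the lower tail is relevant.
# statistics.quantiles with n=100 returns the 1%, 2%, ..., 99% cut points.
cut_points = statistics.quantiles(changes, n=100)
stress_change = cut_points[0]            # empirical 1% quantile of the changes

# Probability of a change at least as bad as the stress value.
prob_of_scenario = 1.0 - confidence

# Empirical check: share of observed changes at or below the stress value.
share_worse = sum(c <= stress_change for c in changes) / len(changes)

print(f"stress change: {stress_change:.1%}, probability: {prob_of_scenario:.1%}")
```

The quality of such a stress value depends entirely on how well the sample represents the current state of the economy, which is the caveat raised above.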

Finally, there are hypothetical scenarios which focus on possible rare events that
might have an important impact on the portfolio, but have not yet been observed in
the form in which they are considered. The crucial point is the presentation of the
consequences of the event on the risk factors. For this estimation, expert opinion is
necessary to accompany the macroeconomic modelling of the dependence of the
risk parameters on risk factors. If macroeconomic parameters are not part of the
input for determining the risk parameters which are stressed, there are three steps
required for macro stress tests. Firstly, it is necessary to model the dependence of
the risk parameters on the risk factors. Secondly, it is necessary to choose values for
the risk factors which are representative of stress events. Since it is intended to
reproduce correlations and causal interrelations between risk factors and stress
events, intricate (macroeconomic) methods of estimation and validation are
needed. A disadvantage of hypothetical scenarios might be the difficulty of specifying
the probability of occurrence of such scenarios. On the other hand, there is
the major advantage of having forward-looking scenarios which do not necessarily
reflect historical events. Thus, hypothetical scenarios present interesting adjuncts to
VaR-based analysis of portfolio risk and are a worthwhile tool for portfolio
management.

The use of risk factors as in the multivariate scenario analysis has the additional 
advantage of allowing common stress tests for credit, market and liquidity risk. 
Here, it is necessary to consider factors that influence several forms of risk or 
scenarios that involve risk factors for them. 

Hypothetical scenarios can also be produced on the basis of expert opinions.
Though this approach might have the disadvantage of being mathematically and
statistically less precise than the one based on macroeconomic modelling, it can in fact
have the advantage of improving the understanding of the risk profile of a portfolio.
For this, it is important to discuss with experts, step by step, all the possible effects
a scenario might have. If this is done in full detail, a thorough risk profile and a good
insight into portfolio risk can be gained. It is also possible to go even further: one can
start with a so-called risk map describing all potential general risks (e.g. classified
with respect to their nature like catastrophes, war and terror, loss of financial
stability, etc.) and their main effects. Having identified the main general risks for
the portfolio, it is possible to use a so-called risk monitor to zoom into these risks
and identify the effects on the portfolio in more detail. Further analysis with experts
can then result in the determination of hypothetical scenarios.


16.7 Conducting Stress Tests 


In this section we discuss how the stress tests introduced in the previous
section can be, and are, applied in financial institutions. We try to
provide details on how to determine and conduct stress tests, focussing mainly on the
performing part of credit portfolios.


16.7.1 Uniform Stress Tests 


The most popular stress tests in banks are uniform stress tests, in particular for the
PDs. The intention is to use increased PDs for the calculation of economic or
regulatory capital. In the easiest case, there is a flat increase rate for all PDs⁴ of
obligors and/or countries, but in general, the change might depend on rating classes,
branches, countries, regions, etc. We suggest several ways to derive the stress PDs:


1. Analyse the default data with respect to the dependence on rating classes, 
branches, countries, regions, etc. This data could originate from the bank’s 
own portfolio or from rating agencies. Determine the deviations of the default 
rates from the PD. Another way to derive such variations might arise from the 
analysis of spreads for respective credit derivatives. The stress PD then can be 
determined from the PD by adding the standard deviation, a quantile or other 
relevant characteristic of the deviation distribution. It might seem to be a good 
idea to use the quantile to determine also a probability of the stress occurring, 
but one should question the quality and the relevance of the distribution before 
using this approach. 

2. Use migration rates (referring to the bank’s own portfolio or coming from rating
agencies) to determine transitions between rating classes. These transitions
might depend on branches, countries, etc. In an intermediate step, stressed
migration matrices can be generated by omitting rating upgrades, by conditioning
on economic downturns (Bangia et al. (2002)), by uniformly increasing the
downgrade rates at the expense of uniformly decreasing the upgrade rates or on
the basis of a time series analysis. Next, one can derive for every original rating
class a stressed rating class by evaluating quantiles or any other characteristics
of the transition probabilities. Consequently, it is possible to build the stress
test on the rating classes. Now, the stress test consists of replacing the original
rating class by the stressed rating class. Alternatively, one can replace the
original PD by the PD of the stressed rating class. A different approach uses
the stressed migration rates. Depending on their derivation, they possibly have
to be calibrated to become transition probabilities. Then they can be used to
calculate an expected PD for every rating class, which can play the role of a
stressed PD.


⁴ Such stress tests are often used by central banks to test the stability of the respective financial
systems. In the studies in Deutsche Bundesbank (2003) PDs are increased by 30% and 60%,
respectively. These changes approximately correspond to downgrades of Standard and Poor’s
ratings by one or two classes, respectively. The latter is seen as conservative in that paper. Banks
should analyse their default data to come up with their own rates of increase, which we expect to be
in the worst case larger than 60%.


The decision as to which option should be chosen for determining the stress PD
should depend on the data available for statistical analysis. Also, expert
opinions could be a part of the process to generate the stress PDs. In particular, it 
makes sense to study the deviations that can be caused by the rating process due to 
sensitive dependence on input parameters. This could lead to an additional add-on 
when generating the stress PDs. 
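
The second option, stressing a migration matrix and deriving stressed PDs from it, can be sketched as follows. The matrix, the class PDs and the cut of 50% in the upgrade rates are hypothetical values chosen for illustration. The function `stress_row` implements one possible reading of "uniformly increasing the downgrade rates at the expense of uniformly decreasing the upgrade rates", and `expected_pd` interprets the expected PD of a class as the PD expected after one migration step; both are assumptions of this sketch rather than a prescribed method.

```python
# Hypothetical one-year migration matrix for three rating classes plus default (D).
# Rows: current class (A, B, C); columns: class after one year (A, B, C, D).
migration = [
    [0.90, 0.07, 0.02, 0.01],   # A
    [0.05, 0.85, 0.07, 0.03],   # B
    [0.01, 0.09, 0.80, 0.10],   # C
]
class_pd = [0.01, 0.03, 0.10, 1.00]  # PD per destination class; default counts as 1


def stress_row(row, own, cut):
    """Reduce all upgrade rates by the factor `cut` and spread the freed
    probability mass over the downgrade entries in proportion to their size."""
    freed = cut * sum(row[:own])
    down_total = sum(row[own + 1:])
    stressed = []
    for k, p in enumerate(row):
        if k < own:
            stressed.append(p * (1.0 - cut))          # upgrade rate, reduced
        elif k == own:
            stressed.append(p)                        # staying rate, unchanged
        else:
            stressed.append(p + freed * p / down_total)  # downgrade rate, increased
    return stressed


def expected_pd(row):
    """PD expected after one migration step with the given transition row."""
    return sum(p * pd for p, pd in zip(row, class_pd))


stressed_migration = [stress_row(row, i, cut=0.5) for i, row in enumerate(migration)]
base_pds = [expected_pd(row) for row in migration]
stressed_pds = [expected_pd(row) for row in stressed_migration]

for name, base, stressed in zip("ABC", base_pds, stressed_pds):
    print(f"class {name}: expected PD {base:.4f} -> stressed {stressed:.4f}")
```

Because the freed mass is only moved within a row, each stressed row still sums to one, so no further calibration is needed in this particular construction. The best class has no upgrades to cut and is therefore unaffected, which shows why this method alone may be too mild for the top rating classes.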

The preference for stressed PDs or stressed rating classes should depend on the
possibilities of realising the stress tests. Regarding the portfolio model, the dependence
of a PD on a branch or country within a rating class could, for example,
represent a problem. A criterion in favour of stressed rating classes is the inclusion
of defaults. Such a stressing might lead to assignments of loans to classes belonging
to the non-performing portfolio. These can be treated accordingly, i.e. instead of
the capital requirements, provisions can be calculated. In the case that PDs are
stressed instead of rating classes, one should first consider the stressing of the PDs
in the portfolio and then the stressing of transition rates to the non-performing part
of the portfolio. In this context, Monte Carlo simulations can be used to estimate
capital requirements for the performing, and provisions for the non-performing, part
of the portfolio.

Transition rates to the non-performing portfolio, usually corresponding to 
default rates, can be stressed in the same form and with the same methods as the 
PDs. The same holds for migration rates between rating classes which are used in 
some portfolio models. 
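
A flat PD stress and its effect on expected loss can be illustrated with a few lines of Python. The four loans and the increase rate of 50% are hypothetical; the only design decision is that stressed PDs are capped at 1, so that loans which are already close to default cannot receive a PD above 100%.

```python
# Hypothetical loans: (PD, LGD, EAD in EUR). All values are illustrative assumptions.
loans = [
    (0.002, 0.25, 2_000_000),
    (0.010, 0.40, 500_000),
    (0.050, 0.30, 1_200_000),
    (0.800, 0.45, 100_000),
]


def expected_loss(portfolio):
    """Expected loss as the sum of PD x LGD x EAD over all loans."""
    return sum(pd * lgd * ead for pd, lgd, ead in portfolio)


def flat_pd_stress(portfolio, rate):
    """Increase every PD by the relative `rate`, capping the result at 1."""
    return [(min(1.0, pd * (1.0 + rate)), lgd, ead) for pd, lgd, ead in portfolio]


base_el = expected_loss(loans)
stressed_el = expected_loss(flat_pd_stress(loans, 0.5))

print(f"expected loss: {base_el:,.0f} EUR -> {stressed_el:,.0f} EUR")
```

Because of the cap on the last loan, the stressed expected loss rises by less than 50% in this example. The same function can be applied to default rates or migration rates into the non-performing portfolio.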

Flat stress tests for LGDs could also be based on statistical analysis, in this case 
for loss data. The approach to determine and study deviations in loss rates is 
analogous to the one for default rates. Expert opinion could play a bigger role. 
An example of an interesting stress test could be provided by a significant fall in 
real estate prices in some markets. 

Uniform stressing of EAD is often not relevant. Deviations of this quantity 
mainly depend on individual properties of the loans. Variations of exchange rates 
can be seen as the most important influence on the deviations of EAD from the 
expected values. It is advisable to investigate this effect separately.

For uniform stressing of parameters used in portfolio models, it seems
best to rely on expert opinions, as it is very difficult to detect and statistically verify
the effect of these parameters on the deviations from expected or predicted values
of defaults and losses.

While it is already rather intricate to determine suitable parameter values for
uniform tests involving single parameters, it becomes even more difficult to do this
for several parameters at the same time. Experience derived from historic observations
and expert opinion seem to be indispensable in this situation.


16.7.2 Sensitivity Analysis for Risk Factors 


This kind of stress testing is very popular for market risk, where risk factors can 
easily be identified, but it can also be seen as basic for scenario analysis. This is due 
to the crucial task of recognising suitable risk factors and introducing a valid 
macroeconomic model for the dependence of risk parameters on the risk factors 
representing the state of the business cycle. Of course, there are obvious candidates 
for risk factors like interest rates, inflation rates, stock market indices, credit 
spreads, exchange rates, annual growth in GDP, oil price, etc. (Kalirai and Scheicher
(2002)). Others might depend on the portfolio of the financial institution and should
be evident to good risk managers. Using time series for the risk factors on relevant
markets, as well as for the deviations of risk parameters, and standard methods of
statistical analysis like discriminant analysis, one should try to develop a macroeconomic
model and determine those factors suitable to describe the evolution of
risk parameters. Typically, the impact of stress on the risk parameters or directly on 
credit loss characteristics is modelled using linear regression. One of the problems 
involves determining the extent to which the risk factors must be restricted, whilst 
allowing a feasible model. 
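
The regression step can be sketched without any external library. The snippet below fits default rates of one sector to two risk factors by ordinary least squares via the normal equations and then evaluates a univariate stress. The factor history and the default rates are hypothetical; the rates are generated from a known linear rule so that the fit can be checked, and the helper names `solve` and `ols` are ours.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients for y ~ intercept + factors (normal equations)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical yearly observations: (relative change of oil price, GDP growth).
factors = [(0.10, 0.02), (-0.05, 0.03), (0.30, -0.01),
           (0.00, 0.01), (0.15, 0.00), (-0.10, 0.025)]
# Hypothetical sector default rates built from a known rule for checking purposes:
# rate = 2% + 0.03 * oil change - 0.40 * GDP growth.
default_rates = [0.02 + 0.03 * oil - 0.40 * gdp for oil, gdp in factors]

intercept, beta_oil, beta_gdp = ols(factors, default_rates)

# Univariate stress: oil price +40%, GDP growth kept at its sample mean.
mean_gdp = sum(g for _, g in factors) / len(factors)
stressed_rate = intercept + beta_oil * 0.40 + beta_gdp * mean_gdp

print(f"beta_oil={beta_oil:.3f}, beta_gdp={beta_gdp:.3f}, stressed rate={stressed_rate:.2%}")
```

With real data the fit will of course not be exact, and the number of factors has to be kept small relative to the length of the time series, which is the restriction problem mentioned above.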

Discovering which risk factors have the biggest impact on the portfolio risk, in
terms of the VaR or whichever measure is used for the evaluation of unexpected losses, is
the target and the benefit of sensitivity analysis. Stressing is analogous to the
uniform stress test on risk parameters. Stress values for a single risk factor are 
fixed on the basis of statistical analysis or expert opinion. The consequences for 
the risk parameters are calculated with the help of the macroeconomic model and 
the modified values for the risk parameters are finally used for evaluating capital 
requirements. Risk factors which have an impact on several risk parameters and
which also play a role in stress testing market risk might be of particular
interest.
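
As a small illustration of how such a sensitivity analysis ranks risk factors, the following sketch stresses one factor at a time through a linear model and sorts the factors by their impact on a sector PD. The base PD, the sensitivities and the stress values are hypothetical numbers, not estimates.

```python
# Hypothetical base PD of a sector and linear sensitivities of that PD
# to relative changes of three risk factors (assumptions for illustration).
base_pd = 0.02
sensitivity = {"oil_price": 0.03, "sp500": -0.05, "gdp": -0.40}

# Hypothetical univariate stress values (relative changes), one factor at a time.
stress_value = {"oil_price": 0.40, "sp500": -0.25, "gdp": -0.02}

# Impact of each single-factor stress on the PD and the resulting stressed PD.
impact = {f: sensitivity[f] * stress_value[f] for f in sensitivity}
stressed_pd = {f: base_pd + impact[f] for f in impact}

# Factors ordered from largest to smallest increase of the PD.
ranking = sorted(impact, key=impact.get, reverse=True)

for f in ranking:
    print(f"{f:10s} PD {base_pd:.2%} -> {stressed_pd[f]:.2%}")
```

In practice the ranking would be based on the resulting capital figures rather than on the PD of a single sector, but the mechanics are the same.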

Sensitivity analysis could also be used to verify the uniform stress testing by 
checking whether the range of parameter changes due to sensitivity analysis is 
also covered by the flat stress tests. Moreover, it can be seen as a way to pre-select 
scenarios: only those historical or hypothetical scenarios which involve risk 
factors showing some essential effects in the sensitivity analysis are worth 
considering.


16.7.3 Scenario Analysis


Having specified the relevant risk factors, one can launch historic scenarios, 
statistical selection of scenarios and hypothetical scenarios. These different methods 
should partly be seen as complementing each other. They can also be used for
specifying, supporting and accentuating one another.


16.7.3.1 Historical Scenarios 


Historical scenarios are easy to implement, as one only has to transfer the values of
risk factors corresponding to a historic event to the current situation. In most cases,
it does not make sense to copy the values of the risk factors themselves. Instead, one
determines the change in value (either in absolute or in relative form) which accompanied
the occurrence of the event and assumes that it also applies to the current evaluation.
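
Transferring a historical scenario in relative form amounts to a simple rescaling of current factor levels. In the sketch below the relative changes are those quoted for the September 11 scenario in Sect. 16.8, while the current factor levels are hypothetical.

```python
# Relative factor changes of the historical event (September 11 scenario, Sect. 16.8).
historical_changes = {"oil_price": -0.25, "sp500": -0.055, "euribor": -0.25}

# Hypothetical current levels of the risk factors (assumptions for illustration).
current = {"oil_price": 80.0, "sp500": 1200.0, "euribor": 0.03, "eur_usd": 1.30}


def apply_relative(levels, changes):
    """Apply relative changes to current levels; factors without a change stay as they are."""
    return {name: value * (1.0 + changes.get(name, 0.0)) for name, value in levels.items()}


stressed = apply_relative(current, historical_changes)

for name in current:
    print(f"{name:10s} {current[name]:>10.4f} -> {stressed[name]:>10.4f}")
```

Factors that were not affected by the event, here the exchange rate, are left at their current level. An absolute transfer would add the historical change instead of multiplying by it.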

The following events are most popular for historical scenarios: 


• Oil crisis 1973/1974

• Stock market crash (Black Monday 1987, global bond price crash 1994, Asia
1998)

• Terrorist attacks (New York 9/11 2001, Madrid 2004) or wars (Gulf war 1990/
1991, Iraq war 2003)

• Currency crisis (Asian 1997, European Exchange Rate Mechanism crisis 1992,
Mexican Peso crisis 1994)

• Emerging market crisis

• Failure of LTCM⁵ and/or Russian default (1998)


Though the implications of historical scenario analysis for risk management
might be restricted due to its backward-looking approach, there are good reasons to
use it. First of all, there are interesting historic scenarios which certainly would not
have been considered, as they happened by accident, i.e. the probability of occurrence
would have been seen as too low to look at them. Examples of this case are
provided by the coincidence of the failure of LTCM and the Russian default or the
1994 global bond price crash. It can be assumed that both events would hardly have
contributed to the VaR at the time of their occurrence, due to the extremely low
probability of joint occurrence of the single incidents.⁶


⁵ The hedge fund Long-Term Capital Management (LTCM), with huge but well diversified risk
positions, was affected in 1998 by a market-wide rise in risk boosted by the Russia crisis. This
led to large losses of equity value. Only a joint effort of several US investment banks under
the guidance of the Federal Reserve could avoid the complete default of the fund and a systemic
crisis in the world’s financial system.

⁶ The movements of government bond yields in the US, Europe and Japan are usually seen as
uncorrelated. Hence their joint upward movement in 1994 can be seen as an extremely unlikely
event.

There is also much to learn about stress testing and scenario analysis from
working with historic scenarios. On the one hand, the latter can be used to check 
the validity of the uniform stress tests and sensitivity analysis, on the other hand, 
they can be very helpful in designing hypothetical scenarios. Thus, the analysis of 
historical scenarios offers the unique possibility of learning about the joint occur- 
rence of major changes to different risk factors and the interaction of several types 
of risks, e.g., the impact of credit risk events on liquidity risk. For these reasons, we 
regard historical scenario analysis as a worthwhile part of establishing a stress 
testing framework, but not necessarily as an essential part of managing and 
controlling risk. 


16.7.3.2 Statistically Determined Scenarios 


A special role is played by the analysis of scenarios which are chosen on the basis 
of risk factor distributions. These are not directly related to the other types of 
scenario analysis. Central to this approach is the use of (joint) risk factor distributions.
While it should not be too difficult to generate such distributions for isolated
common risk factors on the basis of historic data, a situation involving several factors can
be far more intricate. Moreover, distributions generated from historic data
might not be sufficient. It would be much better to use distributions conditioned
to the situation applying at the time of stress testing. This could represent a real 
problem. 

We would like to point out that this approach should be used only in the case of a
reliable factor distribution. If expected losses conditioned on a quantile are
evaluated in order to interpret them as unexpected losses and treat them as an economic
capital requirement, then the risk factor distribution should also be
conditioned on the given (economic) situation.


16.7.3.3 Hypothetical Scenarios 


Hypothetical scenario analysis is the most advanced means of stress testing in risk 
management. It should combine experience in analysing risk relevant events with 
expert opinion on the portfolio, as well as the economic conditions and statistical 
competency. The implementation of hypothetical scenario analysis is analogous to 
the one for historic scenarios. The only difference is provided by the choice of 
values for the risk factors. This can be based on or derived from historical data, but 
expert opinion might also be used to fix relevant values. 

The choice of scenarios should reflect the focus of the portfolio for which the 
stress test is conducted and should have the most vulnerable parts of it as the target. 
Common scenarios (together with risk factors involved) are provided by the 
following:


• Significant rise in oil price (increased oil price, reduced annual growth in GDP to
describe weakened economic growth, indices describing increased consumer
prices, etc.)

• Major increase of interest rates (indices describing the volatility of financial
markets, increased spreads, reduced annual growth in GDP to describe weakened
economic growth, volatility of exchange rates, consumer indices, etc.)

• Drop in global demand (reduced annual growth in GDP, stock market indices,
consumer indices, etc.)

• Emerging market crisis (reduced annual growth in GDP to describe weakened
economic growth, widened sovereign credit spreads, decline in stock prices,
etc.)

• Burst of economic bubbles like the ones on the real estate markets in the US, UK
or Spain in 2007/2008 (reduced annual growth in GDP, drop in exchange rate,
widened sovereign credit spreads, reduced consumer rates, etc.)


Hypothetical scenarios have the additional advantage that they can take into 
account recent developments, events, news and prospects. Note that scenarios 
involving market parameters like interest rates are well suited for combinations 
with stress tests on market and liquidity risk. 


16.8 Examples 


In the following we will present the outcome of some stress tests on a virtual 
portfolio to illustrate the possible phenomena, the range of applications and advan- 
tages corresponding to the tests. The portfolio consists of 10,000 loans and exhibits 
a volume of 159 billion EUR. The loans are normally distributed over 18 rating 
classes (PDs between 0.03% and 20% and a mean of 0.6%) and LGDs (ranging 
from 5 to 50% with a mean of 24%). Moreover, they are gamma-distributed with 
respect to exposure size (ranging from 2,000 EUR to 100 million EUR with mean 1
million EUR). 

To determine economic capital, we employ the well known portfolio model
CreditRisk+ (Gundlach and Lehrbass 2004). We use it here as a six-factor model;
this means that we incorporate six (abstract) factors corresponding to so-called
sectors (real estate, transport, energy, resources, airplanes, manufacturing) which
represent systematic risk drivers. For our version of CreditRisk+, each obligor $j$ is
assigned to exactly one sector $k = k(j)$. This is done according to a weight $w_j$,
$0 < w_j < 1$. For each sector $k$ there is a corresponding random risk factor $S_k$, which
is used to modify the PD $p_j$ to $\rho_j$ via


$$\rho_j = p_j \, w_j \, S_{k(j)}. \tag{16.1}$$


The random factors $S_k$ have mean 1 and are gamma-distributed with one parameter
$\sigma_k$ corresponding to the variance of the distribution. Correlations in CreditRisk+
are thus introduced via the $S_k$, i.e. in our CreditRisk+ version, only correlations
between obligors from the same sector are sustained. The strength of the correlations
depends on the weights $w_j$ and the variation $\sigma_k$. These parameters can both be
modified in stress tests, though it seems more natural to increase the $\sigma_k$’s.
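
The role of the sector variances can be made tangible with a small Monte Carlo experiment. The sketch below is not CreditRisk+ itself, which computes the loss distribution analytically; it merely simulates defaults with conditional PDs taken literally from (16.1) for a hypothetical portfolio of 200 obligors in two sectors. The portfolio, the variances of 2.0 and 3.0, the number of runs and the 99% confidence level are assumptions chosen so that the example runs quickly.

```python
import random

random.seed(42)

# Hypothetical mini-portfolio: (PD, weight w_j, sector index, LGD x exposure in EUR).
# PDs lie between 1.0% and 2.8%, weights between 0.25 and 0.75.
obligors = [(0.01 + 0.002 * (j % 10), 0.25 + 0.05 * (j % 11), j % 2, 1_000_000.0)
            for j in range(200)]


def simulate_losses(variances, n_runs=2000):
    """Draw gamma sector factors with mean 1 and the given variances, then
    Bernoulli defaults with conditional PD p_j * w_j * S_k as in (16.1)."""
    losses = []
    for _ in range(n_runs):
        # gammavariate(alpha, beta) has mean alpha*beta and variance alpha*beta**2,
        # so alpha = 1/v and beta = v give mean 1 and variance v.
        factors = [random.gammavariate(1.0 / v, v) for v in variances]
        loss = 0.0
        for pd, w, sector, lgd_exposure in obligors:
            cond_pd = min(1.0, pd * w * factors[sector])
            if random.random() < cond_pd:
                loss += lgd_exposure
        losses.append(loss)
    return sorted(losses)


def unexpected_loss(losses, confidence=0.99):
    """Empirical quantile at the confidence level minus the mean loss."""
    expected = sum(losses) / len(losses)
    quantile = losses[int(confidence * len(losses)) - 1]
    return quantile - expected


base_ul = unexpected_loss(simulate_losses([2.0, 2.0]))
stressed_ul = unexpected_loss(simulate_losses([3.0, 3.0]))  # variances increased by 50%

print(f"unexpected loss: base {base_ul:,.0f} EUR, stressed {stressed_ul:,.0f} EUR")
```

One would expect the stressed figure to be the larger one, since a higher variance fattens the tail of the sector factors while leaving the expected loss unchanged. With so few runs and a coarse loss grid the simulation noise is considerable, however, so the comparison should be read as an illustration only.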

The loans in the portfolio are randomly distributed over the six sectors, repre- 
senting systematic risk, and 13 countries, which play a role in some of the scenarios. 
The dependence of the loans on respective systematic risk factors varies between 25 
and 75% and is randomly distributed in each sector. The sectorial variation parameters
$\sigma_k$ are calculated from the volatilities of the PDs according to a
suggestion from the original version of CreditRisk+ and range between 1.8 and 2.6.

In the stress tests we only take account of the dependence of the risk parameter
PD on risk factors $f_j$. When modelling this interrelation, we used a simple linear
regression to predict the changes of rating agencies’ default rates for the sector and
country division of the portfolio and transferred this dependence to the PDs $p_i$ used
in our model:


$$p_i = \sum_j x_{ij} f_j + u_i. \tag{16.2}$$


Here the $u_i$’s represent residual variables and the indices refer to a classification
of PDs according to sectors and countries. Due to the small amount of data and the 
crude portfolio division, we ended with a rather simple model for the PDs with 
respect to their assignment to sectors and countries involving only an oil price 
index, the S&P 500-Index, the EURIBOR interest rate, the EUR/USD exchange 
rate and the GDP of the USA and EU. 
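
To show how a relation of the form (16.2) turns a multivariate scenario into stressed PDs, the following sketch applies the factor changes of the "Recession USA" scenario from the list of tests in this section to two sector/country cells. The sensitivities and the base PDs are hypothetical and are not the coefficients estimated for the virtual portfolio; the sketch also assumes that the regression is applied to changes, so that the scenario shifts the base PD.

```python
# Hypothetical sensitivities x_ij: absolute change of the PD per unit relative
# change of each risk factor, for two illustrative sector/country cells.
sensitivities = {
    ("transport", "USA"): {"sp500": -0.020, "gdp_usa": -0.150,
                           "gdp_eu": -0.020, "eur_usd": 0.000},
    ("manufacturing", "EU"): {"sp500": -0.010, "gdp_usa": -0.030,
                              "gdp_eu": -0.120, "eur_usd": 0.010},
}
# Hypothetical base PDs of the two cells.
base_pd = {("transport", "USA"): 0.012, ("manufacturing", "EU"): 0.008}

# Relative factor changes of the "Recession USA" scenario.
scenario = {"sp500": -0.10, "gdp_usa": -0.05, "gdp_eu": -0.02, "eur_usd": 0.20}


def stressed_pd(cell):
    """Base PD plus the linear effect of all factor changes, kept within [0, 1]."""
    shift = sum(x * scenario[f] for f, x in sensitivities[cell].items())
    return min(1.0, max(0.0, base_pd[cell] + shift))


stressed = {cell: stressed_pd(cell) for cell in base_pd}

for cell, pd in stressed.items():
    print(f"{cell}: PD {base_pd[cell]:.2%} -> {pd:.2%}")
```

The same scenario hits the two cells differently, which is the kind of sector and country differentiation reported in the last column of the results table.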

We performed several stress tests on the virtual portfolio. The evaluation of 
these tests takes place in terms of expected loss, regulatory and economic capital. 
For the latter, we calculate the unexpected loss as the difference between VaR for a 
confidence level of 99.99% and expected loss. We focus on the outcome for the 
whole portfolio, but also report on interesting phenomena for sub-portfolios. The 
calculations of regulatory capital are based on the Basel II IRBA approach for
corporates, while the estimations of VaR are done with CreditRisk+. Loss provisions
are also considered in some tests. In the case that the assignment of obligors to
rating classes is stressed, non-performing loans and hence candidates for loan
provisions are implicitly given. In other cases, they are determined for each rating
class according to a stressed PD. The volume of the respective portfolio is reduced
accordingly.
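
For readers who wish to reproduce the regulatory side of such a test, the following sketch evaluates the Basel II IRB capital requirement for a single corporate exposure before and after a flat PD increase. It follows the risk-weight function for corporates without the firm-size adjustment for SMEs and without the scaling factor of 1.06 that is applied to risk-weighted assets. The PD of 1%, the LGD of 45% and the maturity of 2.5 years are illustrative assumptions, not values from the virtual portfolio.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal distribution


def irb_capital_requirement(pd, lgd, maturity=2.5):
    """Basel II IRB capital requirement K per unit of EAD for corporate exposures."""
    pd = max(pd, 0.0003)  # regulatory PD floor of 0.03%
    # Asset correlation, interpolated between 24% (low PD) and 12% (high PD).
    weight = (1.0 - exp(-50.0 * pd)) / (1.0 - exp(-50.0))
    corr = 0.12 * weight + 0.24 * (1.0 - weight)
    # Maturity adjustment.
    b = (0.11852 - 0.05478 * log(pd)) ** 2
    # PD conditional on the 99.9% quantile of the systematic factor.
    cond_pd = N.cdf((N.inv_cdf(pd) + sqrt(corr) * N.inv_cdf(0.999)) / sqrt(1.0 - corr))
    return lgd * (cond_pd - pd) * (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)


base_k = irb_capital_requirement(0.01, 0.45)
stressed_k = irb_capital_requirement(0.015, 0.45)  # flat PD increase by 50%

print(f"K = {base_k:.2%} of EAD, stressed K = {stressed_k:.2%} of EAD")
```

Since the risk-weight function is concave in the PD over the relevant range, a 50% increase of the PD raises the capital requirement by clearly less than 50%.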

We have considered the following stress tests, including uniform stress tests, 
sensitivity analysis, historical and hypothetical scenario analysis: 


1. Flat increase of all PDs by a rate of 50%, (a) with and (b) without loan loss
provisions

2. Flat increase of all PDs by a rate of 100%, (a) with and (b) without loan loss
provisions

3. Uniform downgrade of all loans by one rating class (rating class + 1)

4. Flat increase of all LGDs by 5%

5. Flat increase of all PDs by a rate of 50% and all LGDs by 5%

6. Flat increase of all sectorial variances $\sigma_k$ by a rate of 50%

7. Flat increase of all LGDs by 10% for real estates in UK and USA (burst of real
estate bubble)

8. Drop of stock market index (S&P500-Index) by 25%

9. Rise of oil price by 40%

10. September 11 (drop of oil price by 25%, of S&P-Index by 5.5%, EURIBOR by
25%)

11. Recession USA (drop of S&P-Index by 10%, GDP of USA by 5%, GDP of EU
by 2%, increase of EUR/USD-exchange rate by 20%)


The outcome is summarised in the following table, where all listed values are in
million EUR (Table 16.1).


Table 16.1 Outcome of stress testing on a virtual portfolio

| No. | Stress test | Regulatory capital | Economic capital | Expected loss | Loss provision | Sectorial increase of economic capital |
|---|---|---|---|---|---|---|
| - | None (basis portfolio) | 3,041 | 1,650 | 235 | 0 | |
| 1 | PD × 150% | 3,715 | 2,458 | 353 | 0 | |
| 1 | PD × 150% with provisions | 3,631 | 2,255 | 320 | 332 | |
| 2 | PD × 200% | 4,238 | 3,267 | 470 | 0 | |
| 2 | PD × 200% with provisions | 4,151 | 2,996 | 427 | 332 | |
| 3 | Rating class + 1 | 3,451 | 1,911 | 273 | 376 | |
| 4 | LGD + 5% | 3,676 | 1,985 | 283 | 0 | |
| 5 | LGD + 5%, PD × 150% | 4,490 | 3,935 | 567 | 0 | |
| 6 | Systematic factor 150% | 3,041 | 3,041 | 235 | 0 | |
| 7 | Real estate bubble | 3,106 | 1,686 | 240 | 0 | 32% for real estates, 45% for UK and USA |
| 8 | Stock price decline | 3,591 | 2,368 | 329 | 0 | 58% for USA, Western Europe, Japan |
| 9 | Rise of oil price | 3,430 | 2,057 | 300 | 0 | 65% for transport and airplanes |
| 10 | Terror attack New York September 11 | 3,897 | 2,622 | 399 | 0 | 77% for USA, Western Europe, Japan |
| 11 | Recession USA | 3,688 | 2,307 | 351 | 0 | 68% for USA and South America, 57% for airplanes |


The inclusion of loss provisions does not seem to play a major role in the overall
outcome of stress testing, as the sum of the provisions and the economic capital is
rather small. Nevertheless, the distinction between economic capital and provisions
(in particular the comparison of the latter with expected loss) is quite
interesting. Also, the distinction between stressing PDs and stressing the
assignment to rating classes has a rather limited impact on the result of the stress
testing. Furthermore, it is not a surprise that stress testing has a larger impact on
economic capital than on regulatory capital.
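
The comparison of the two capital measures can be checked with a few lines of arithmetic on selected rows of Table 16.1. The snippet computes the relative increase of regulatory and economic capital over the basis portfolio; the figures are taken from the table, while the selection of rows is ours.

```python
# Values from Table 16.1 in million EUR: (regulatory capital, economic capital).
base = (3041, 1650)
tests = {
    "PD x 150%": (3715, 2458),
    "PD x 200%": (4238, 3267),
    "Rating class + 1": (3451, 1911),
    "LGD + 5%": (3676, 1985),
    "Rise of oil price": (3430, 2057),
    "Recession USA": (3688, 2307),
}


def rel_increase(value, reference):
    """Relative increase of a value over its reference."""
    return value / reference - 1.0


increases = {name: (rel_increase(reg, base[0]), rel_increase(eco, base[1]))
             for name, (reg, eco) in tests.items()}

for name, (reg, eco) in increases.items():
    print(f"{name:18s} regulatory {reg:6.1%}   economic {eco:6.1%}")
```

For the flat PD increase by 50%, regulatory capital rises by about 22% and economic capital by about 49%. The LGD test is the exception among the selected rows: both measures rise by roughly 20 to 21%, as one would expect for quantities that are essentially linear in the LGD. The ratio 3,676/3,041 is close to 29/24, which is consistent with reading "LGD + 5%" as an additive shift of five percentage points on the mean LGD of 24%; this reading is an inference from the figures and is not stated in the text.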

The significant diversity of impact on the sectors and countries by the scenario 
analysis underscores the importance of this kind of stress testing for detecting weak 
spots in the portfolio and for portfolio management. As the portfolio used here is 
rather well diversified, the effects would be larger in a real portfolio. Also, the 
simultaneous stressing of several risk parameters has major effects. This is under- 
lined by the joint increase of PDs and LGDs. Also, the role of parameters describing 
systematic risk cannot be overestimated, as is indicated by the test given by the 
increase of systematic risk factors. Some of the scenarios fail to exhibit effects
one would expect (like a major deterioration of the airplane industry in the historic
scenario concerning the terrorist attacks of September 11), which could not be
indicated by the linear regression, but which could be produced in the design of the
stress test using expert opinion.


16.9 Conclusion 


Stress testing is a management tool for estimating the impact on a portfolio of a 
specific event, an economic constellation or a drastic change in risk relevant input 
parameters, which are exceptional, even abnormal, but plausible and can cause 
large losses. It can be seen as an amendment as well as a complement to VaR-based 
evaluations of risk. It allows the combination of statistical analysis and expert
opinions for generating relevant and useful predictions on the limits for unexpected 
losses. 

Stress testing should be seen not only as a risk management method, though it
can be used as such in various ways, but also as a means of analysing risk and risk
relevant constellations. In particular, it should lead to a higher awareness and
sensitivity towards risk. This requires a better knowledge of risk drivers, portfolio 
structure and the development of risk concentrations. It cannot be achieved in a 
standard way. Instead experience, research and sustained investigations are 
required. In particular it makes sense to use an evolutionary approach to overcome 
the complexity of requirements for stress testing. 

We would like to make the following suggestion as an evolutionary way towards 
a reasonable and feasible framework of stress testing. The basis of stress tests is 
provided by rich data for defaults, rating transitions and losses. The starting point 
for the development of stress tests should be an analysis of the volatilities of these 
rates and estimations for bounds on deviations for them. The statistical analysis 
should be accompanied by investigations of the reasons for the deviations. It 
should be studied which fraction of the deviations arise from the methodology 
of the rating processes and which from changes in the economic, political, 
etc. environment. Expert opinion should be used to estimate bounds for the 
deviations arising from the methodology. Statistical analysis should lead to an
identification and quantification of the exogenous risk factors having the main
impact on the risk parameters needed to determine capital requirements. The 
combination of these two procedures should enable the establishment of uniform 
stress testing. 

The analysis of default and loss data with respect to estimating deviations from 
the risk parameters should be followed by statistical analysis of the dependence of 
these deviations on risk factors and an identification of the most relevant factors. 
For the latter, first considerations of historic events which are known to have a 
large impact on portfolio risk should also be taken into account. These investiga- 
tions should culminate in a macroeconomic model for the dependence of risk 
parameters on risk factors. With this model, sensitivity analysis for risk factors
can be performed. The outcome of these stress tests can be used to check whether 
the uniform stress tests involve sufficient variations of risk parameters to cover 
the results of univariate stress tests. As a consequence, the uniform stress tests 
might have to be readjusted. Moreover, the sensitivity analysis should also be 
used to check whether the chosen risk factors are contributing to drastic changes 
in the portfolio. If this is not the case, they should be neglected for further stress
tests. 

The involvement of relevant risk factors should also be a good criterion for 
picking historical and hypothetical scenarios. It makes sense to consider historical 
scenarios first in order to benefit from the experience with historical data. This 
experience should also include the consideration of the interplay of different kinds 
of risks like market, credit, operational, liquidity risk, etc. The design of hypotheti- 
cal scenario analysis should be seen as the highlight and culmination point of the 
stress testing framework. 

Scenario analysis based on statistical analysis is a method which is not 
connected too closely with the others. Nevertheless, a lot of preliminary work has 
to be done to generate reliable tests of this kind. The main problem is the generation 
of probability distributions for the risk factors, in particular joint distributions and 
distributions conditioned on actual (economic) situations. 

The evolutionary approach towards a feasible framework for stress testing can 
be summarized by the chart in Fig. 16.1. 

Having established a stress testing framework, we recommend:


• Regular uniform stress tests for regulatory and economic capital in order to
provide a possibility for evaluating the changes made to the portfolio in terms of
possible extreme losses, and

• Hypothetical scenario analysis suitable to the actual portfolio structure and the
conditions provided by the economy, politics, nature, etc.


The latter should partly be combined with stress tests on market and liquidity 
risk. Also, effects on reputational and other risks should not be neglected. 
Furthermore, one should have in mind that a crisis might have a longer horizon 
than 1 year, the typical period for evaluations of risk, even in the common stress 
scenarios. 

Fig. 16.1 Development of a stress testing framework


Chapter 17
Risk Management of Loans and Guarantees 


Bernd Engelmann and Walter Gruber 


17.1 Introduction 


In previous chapters, the estimation of the key loan risk parameters was presented. 
In Chaps. 1-3, and 5 estimation methods for 1-year default probabilities were 
discussed. In typical credit risk applications, however, a 1-year horizon is insuffi- 
cient. In Chap. 6 it was shown how to compute multi-year default probabilities with 
the help of transition matrices. In Chap. 7 techniques to estimate default probabil- 
ities and loss given default rates simultaneously were discussed while Chaps. 8 and 
9 presented methods for loss given default estimation. 

In recent years, banks have invested considerable effort in building up databases,
constructing rating systems and estimating the credit risk parameters PD, LGD, and
EAD from the collected data. This work was mainly driven by regulatory
considerations. Under the Basel II capital accord banks are allowed to calculate their
regulatory capital from these risk parameters if the estimation procedures fulfil the 
quality requirements of supervisors. This might lead to capital reductions compared 
to the old framework if the credit quality of a bank’s debtors and the quality of 
collateral that is used to back a bank’s loan portfolio is sufficiently high. 

In our view, it would be a waste of effort if the estimated risk parameters were
used mainly for regulatory purposes. In this chapter, we show how they can be used 
to price loans and guarantees. We show how the basic pricing formulas can be used to 
compute the terms of a loan, how the premium of a guarantee can be determined, and 
how the model can be used to calculate general loss provisions in a consistent 
and economically meaningful way, i.e. how the model can be used in managing the 
risk of credit losses. 

Often loan portfolios and bond portfolios are treated very differently by a bank.
From an economic perspective this does not make sense because the characteristics 
of both products are very similar. Both products consist of a stream of deterministic 
cash flows that are subject to default risk. The main difference is that bonds are 
tradable in contrast to most loans. For this reason we suggest an approach for the 
pricing and risk management of loans that is structurally very similar to bonds. 
Since it is not possible to observe spreads, e.g. asset swap spreads or CDS spreads, 
for most debtors in the loan market, we use default probabilities for the measure- 
ment of a debtor’s credit quality instead. 

This chapter consists of three sections. In the first section, we explain the 
pricing formulas for loans and guarantees and illustrate their use for the most 
popular loan structures, bullet loan, instalment loan and annuity loan. In the second 
part we explain how to compute the terms of a loan. Our scheme is based on 
the RAROC (risk-adjusted return on capital) concept. Further, we show how to 
compute general loss provisions for a loan portfolio dynamically. We conclude this 
article with a short discussion of our loan pricing framework in the light of the 
recent financial crisis. 


17.2 Pricing Framework 


In this section we explain the pricing formulas and the input data of these formulas. 
In the first part we discuss the pricing of loans and in the second part we state the 
formulas for guarantees. 


17.2.1 Pricing of Loans 


We explain the valuation of a loan that is characterized by interest rate payments 
that might be either fixed or floating and a deterministic amortization schedule. 
A deterministic amortization schedule implies that a loan does not contain any 
embedded options like prepayment rights. This case is treated in detail in Chap. 18. 
Under this assumption the value of a loan is the discounted expected value of 
all future cash flows. The future cash flows consist on the one hand of the interest 
rate and amortization payments, on the other on the recovery in the case of a default 
of the debtor. We find for the value V of a loan the expression 


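$$V = \sum_{i=1}^{n} \left( z_i \cdot \tau_i \cdot N_i + A_i \right) \cdot df(T_i) \cdot q(T_i) + \sum_{i=1}^{n} R_i \cdot N_i \cdot \int_{T_{i-1}}^{T_i} df(t) \cdot \left( q(t) - q(t + dt) \right) \qquad (17.1)$$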

With $T_1, \ldots, T_n$ we denote the future interest rate payment dates of the loan, where
$T_n$ is the repayment date of the loan’s outstanding notional. With $z_i$ we denote the
annualized interest rate corresponding to period $i$, which might be either fixed or
floating,¹ with $A_i$ the amortization payment in period $i$, with $\tau_i$ the year fraction of the
$i$-th interest rate period, with $df(T)$ the discount factor corresponding to time $T$, with
$q(T)$ the survival probability of the debtor up to time $T$, with $N_i$ the loan’s notional
that is outstanding in period $i$, and with $R_i$ the recovery rate corresponding to period
$i$, i.e. the percentage of the notional that can be recovered by the creditor if the debtor
defaults. This recovery rate might be period dependent. Suppose a loan is backed by
collateral and the value of this collateral is assumed constant throughout the lifetime
of the loan. If the loan is amortizing then the percentage of the loan that is
collateralized is increasing in time. Therefore the recovery rate is increasing in time.

The discount factors are computed from an interest rate curve that can be 
considered as approximately risk free. Often the swap curve is taken as a reference 
curve. This curve is certainly not risk free because it reflects the credit risk in the 
interbank market. It is nevertheless a reasonable choice for a reference curve 
because it reflects the funding conditions of banks. An additional spread reflecting 
the debtor’s credit quality is not included into the discounting in (17.1). The 
debtor’s credit quality is included by his time-dependent survival probability only. 

The two terms in (17.1) can be interpreted intuitively. The first term is the 
discounted expected value of all interest rate and amortization payments. Each 
interest rate and amortization payment is discounted according to its payment date 
and weighted with the probability of its occurrence, i.e. the survival probability of 
the debtor. The second term is a bit more complicated. It models the recovery if the 
debtor defaults. In contrast to the interest rate payments the default time is not 
known in advance. Therefore, we have to compute the default probability of each 
small time interval in the future, weight it with the discounted value of the recovery, 
and sum over all future small time intervals. In its exact form this leads to an 
integral. It is possible to approximate this integral by an easy to evaluate formula. 
We assume that on average a debtor defaults in the middle of each interest rate
period, i.e. at the time $t = 0.5 \cdot (T_i + T_{i-1})$. We discount the recovery from the
period midpoint and weight the result with the probability that a debtor defaults in the
period. This leads to


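$$R_i \cdot N_i \cdot \int_{T_{i-1}}^{T_i} df(t) \cdot \left( q(t) - q(t + dt) \right) \approx R_i \cdot N_i \cdot df\left(0.5 \cdot (T_i + T_{i-1})\right) \cdot \left( q(T_{i-1}) - q(T_i) \right) \qquad (17.2)$$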


¹A floating interest rate is often directly linked to a Libor rate that is fixed at the beginning of each
interest rate period. In this case $z_i$ can be computed as a forward rate from the discount curve that is
extracted from the swap market. In other cases, the bank has some freedom to decide about when 
to increase or decrease a floating interest rate. Here some assumption has to be made how the 
bank’s decision is linked to the forward rates implied from the swap curve. 

In a concrete implementation we have to define $T_0 = 0$. The approximation
(17.2) is easy to implement and very accurate. 
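
To illustrate how (17.1) and (17.2) fit together, the following Python sketch values a loan from its cash flow schedule. It is a minimal illustration, not a production implementation. The function name `loan_value`, the use of the period length as year fraction, and the example inputs (flat 5% discount curve, a made-up survival curve, a constant recovery rate of 40%) are assumptions chosen for the example.

```python
from typing import Callable, Sequence


def loan_value(
    pay_times: Sequence[float],      # T_1, ..., T_n in years
    rates: Sequence[float],          # annualized interest rates z_i
    notionals: Sequence[float],      # outstanding notionals N_i
    amortizations: Sequence[float],  # amortization payments A_i
    recoveries: Sequence[float],     # recovery rates R_i
    df: Callable[[float], float],    # discount factor df(T)
    q: Callable[[float], float],     # survival probability q(T)
) -> float:
    """Loan value according to (17.1), with the integral replaced by (17.2)."""
    value = 0.0
    t_prev = 0.0  # T_0 = 0
    for t, z, n, a, r in zip(pay_times, rates, notionals, amortizations, recoveries):
        tau = t - t_prev  # assumption: year fraction equals the period length
        # Interest and amortization are received only if the debtor survives.
        value += (z * tau * n + a) * df(t) * q(t)
        # Recovery is discounted from the period midpoint and weighted with
        # the probability of a default inside the period.
        value += r * n * df(0.5 * (t + t_prev)) * (q(t_prev) - q(t))
        t_prev = t
    return value


# Hypothetical example: 5-year bullet loan with annual payments.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
flat_df = lambda t: 1.05 ** (-t)   # flat 5% zero rate, annual compounding
riskless_q = lambda t: 1.0         # debtor never defaults
risky_q = lambda t: 0.98 ** t      # made-up survival curve

bullet = dict(
    pay_times=times,
    rates=[0.05] * 5,
    notionals=[100.0] * 5,
    amortizations=[0.0] * 4 + [100.0],
    recoveries=[0.4] * 5,
    df=flat_df,
)
value_riskless = loan_value(q=riskless_q, **bullet)
value_risky = loan_value(q=risky_q, **bullet)
```

Without default risk the loan paying the 5% curve rate is worth par, which gives a quick sanity check of the discounting. With default risk the same loan is worth less than par, and the difference is what the risk costs have to compensate.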

We remark that we do not model the complicated process of liquidating a loan’s 
collateral explicitly. We assume that at the time of default the creditor will receive 
the discounted value of all future payments from liquidating collateral minus the 
associated costs with the liquidation process. The complexity of the liquidation 
process is therefore reflected in the estimation of the recovery rate $R_i$, not in the
formulas for loan valuation. 

Formulas (17.1) and (17.2) look pretty simple because of the probabilistic 
assumptions we have implicitly used without stating them yet. First, we can write 
expected discounted values of future cash flows by weighting the cash flows with 
the product of discount factor and survival probability of the debtor because of the 
assumption that defaults and interest rate dynamics are independent. Second, we 
can write the expected recovery in the case of a default as the product of recovery 
rate and default probability because we have implicitly assumed that default 
probabilities and recovery rates are independent. The first assumption might be 
problematic for banks selling mostly floating rate loans because rising interest rates 
should lead to an increase in default rates in this context. The second assumption is 
also in contradiction to empirical literature (Frye 2000 and 2003) and to the 
observation of falling house prices in connection with high default rates during 
the recent financial crisis. This should be kept in mind when parameterizing this 
simple formula. We will come back to this point in the final section. 

For the practical application of (17.1) and (17.2) we have to estimate discount 
factors, recovery rates, and survival probabilities. The easiest part is the determina- 
tion of discount factors. They are computed from quotes of interbank market 
instruments like deposits, interest rate futures, or swaps. These quotes are available 
every day and can be accessed by market data providers like Bloomberg or Reuters. 
As already outlined before, this interest rate curve is suitable for loan valuation 
because the interbank curve serves as the reference for determining the funding 
conditions of a bank. 

More difficult is the estimation of the survival probability (or equivalently, the 
default probability) of a debtor. There are basically three possibilities of estimating 
survival probabilities:


• Direct estimation of survival probability term structures
• Extrapolation from transition matrices
• Extraction from bond or credit default swap spreads


First, term structures of survival probabilities can be estimated directly. If the 
survival probabilities for different rating grades are known for a number of years 
and a reasonable parameterization for the shape of this term structure is given, it 
might be calibrated for each rating grade separately. Second, term structures of 
survival probabilities can be easily extrapolated from 1-year transition matrices 
under the assumption of Markovian rating transitions and time-homogeneity. This 
is explained in detail in Chap. 6. Finally, in rare cases of debtors with bond 
issues outstanding, the term structure of survival probabilities can be inferred 
from bond or credit default swap spreads. Detailed explanations on the calculation
of survival probabilities from credit default swap spreads can be found in Brigo and 
Mercurio (2006). 

The final parameter needed is the recovery rate. This parameter typically is 
determined from the collateral of a loan. For each type of collateral a separate 
recovery rate is estimated from the data of defaulted debtors. From the recovery 
rates of each type of collateral and the recovery rate for the unsecured part of the 
loan a net recovery rate for the loan will be computed. The basic principles of 
estimating recovery rates are explained in more detail in Chap. 11. 

We conclude this section with the specification of (17.1) for the three most 
common loan types, bullet loan, instalment loan, and annuity loan. The simplest 
loan type is the bullet loan with initial notional N. Here no amortization prior to the 
repayment of the notional at the loan’s expiry takes place. We have 


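$$N_i = N$$
$$A_i = \begin{cases} 0, & i < n \\ N, & i = n \end{cases}$$
$$z_i = \begin{cases} z, & \text{fixed interest rate} \\ f_i + m, & \text{floating interest rate} \end{cases}$$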


where $f_i$ is the forward rate corresponding to the $i$-th interest rate period and $m$ is the
margin (or spread) over Libor the debtor has to pay for the loan if the interest rate is
floating. Compared to the bullet loan, the instalment loan has a fixed amortization
payment in addition to the interest rate payments in every period $i$. This amortization
payment is specified by a constant annualized amortization rate $a$. We find for
the instalment loan


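$$A = \frac{a \cdot N}{k}$$
$$N_i = \max\left(0, N - (i - 1) \cdot A\right)$$
$$A_i = \begin{cases} A, & i < n \\ N_n, & i = n \end{cases}$$
$$z_i = \begin{cases} z, & \text{fixed interest rate} \\ f_i + m, & \text{floating interest rate} \end{cases}$$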


where k is the number of interest rate (and amortization) payments per year. Of 
course one has to make sure that $N_i$ is always non-negative. For an instalment
loan the amortization payment is constant over its lifetime while the interest rate 
payments are reduced due to the reduction in the loan’s outstanding notional. 
Therefore, the sum of amortization and interest rate payments is not constant over 
the lifetime of an instalment loan. This is the case for an annuity loan. To ensure 
that the sum of interest and amortization payments is constant over the loan’s 
lifetime the interest rate has to be fixed to a value z. We get for the annuity loan 

$$N_i = \begin{cases} N, & i = 0 \\ N_{i-1} - A_i, & i > 0 \end{cases}$$


where $K$ is the constant sum of interest and amortization payments in each period.
Note that it is not possible to compute $A_i$ and $N_i$ independent of each other for an
annuity loan. Both quantities have to be computed recursively starting from $i = 0$.
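
The three amortization schedules can be sketched in Python as follows. The bullet and instalment schedules follow the formulas above, with one extra guard that caps an amortization payment at the remaining notional. For the annuity loan the sketch makes two assumptions that are not taken from the text: the constant payment $K$ is computed from the standard annuity formula, and each list entry holds the notional outstanding at the start of the period, so the indexing differs slightly from the recursion above. All function names are hypothetical.

```python
def bullet_schedule(notional, n):
    """Notionals N_i and amortizations A_i of a bullet loan with n periods."""
    notionals = [notional] * n
    amortizations = [0.0] * (n - 1) + [notional]
    return notionals, amortizations


def instalment_schedule(notional, a, k, n):
    """Instalment loan with annualized amortization rate a and k payments per year."""
    amort = a * notional / k
    # N_i = max(0, N - (i - 1) * A); list index 0 corresponds to period i = 1.
    notionals = [max(0.0, notional - i * amort) for i in range(n)]
    # A_i = A for i < n (capped at the remaining notional), A_n = N_n.
    amortizations = [min(amort, notionals[i]) for i in range(n - 1)]
    amortizations.append(notionals[-1])
    return notionals, amortizations


def annuity_schedule(notional, z, tau, n):
    """Annuity loan with fixed rate z > 0, constant year fraction tau, n periods.

    Assumption: K follows from the standard annuity formula so that the
    loan is fully repaid after n periods.
    """
    x = z * tau
    payment = notional * x / (1.0 - (1.0 + x) ** (-n))  # constant sum K
    notionals, amortizations = [], []
    outstanding = notional
    for _ in range(n):
        notionals.append(outstanding)
        amort = payment - x * outstanding  # K minus the interest of the period
        amortizations.append(amort)
        outstanding -= amort               # recursion for the next period
    return payment, notionals, amortizations


# Example: notional 100, 15 years of quarterly payments, 4% annual amortization.
inst_notionals, inst_amorts = instalment_schedule(100.0, 0.04, 4, 60)
# Example: 10-year annuity loan at a fixed rate of 5% with annual payments.
K, ann_notionals, ann_amorts = annuity_schedule(100.0, 0.05, 1.0, 10)
```

In the annuity case the loop makes the recursive dependence visible: the amortization of a period needs the outstanding notional, and the next notional needs that amortization.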


17.2.2 Pricing of Guarantees 


In this section, we explain the pricing formulas for guarantees. In a guarantee a 
bank or some other financial institution provides insurance against the default of a 
debtor of a loan. In the case of a default the guarantor will either pay for the loss of 
the loan or take over the loan at par, i.e. buy the loan and pay the outstanding 
notional for it. For insuring the loan against defaults the guarantor gets a premium g 
which is paid periodically and is proportional to the loan’s outstanding notional. In 
this respect a guarantee is very similar to a credit default swap. The only difference 
is that the underlying of a credit default swap is one or more bonds of a company or 
a state while the underlying of a guarantee is a loan. 

The pricing of a guarantee is very similar to the pricing of a loan. Its value is the 
expected discounted value of all future cash flows. We take the perspective of a 
guarantor who receives premium payments and has to buy the loan at par in the case 
of a default. We assume that premiums are paid at the end of each period and that in 
case of a default no premium has to be paid for the period where the default 
occurred.²

The equivalent to (17.1) for the value G of a guarantee is 


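$$G = g \cdot \sum_{i=1}^{n} \tau_i \cdot N_i \cdot df(T_i) \cdot q(T_i) - \sum_{i=1}^{n} (1 - R_i) \cdot N_i \cdot \int_{T_{i-1}}^{T_i} df(t) \cdot \left( q(t) - q(t + dt) \right) \qquad (17.3)$$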


²There might be other conventions in the market concerning premium payments, i.e. the premium
might be paid at the beginning of each period or in the case of a default the premium for the period 
where the default occurred has to be paid up to the default time. In this case, the formulas we derive 
for the expected present value of premium payments have to be slightly modified to properly 
reflect the different convention that is used. 

Here, $g$ is the annualized premium that has to be paid for the guarantee, $n$ is the
number of future periods of the guarantee’s lifetime, $T_i$ are the payment dates of the
premium, $N_i$ is the loan’s outstanding notional in period $i$, $df(T)$ is the discount
factor and $q(T)$ the survival probability of the loan’s debtor corresponding to time $T$.
The first term in (17.3) is the expected present value of the premium payments that 
are paid if the debtor survives and the second term is the loss of the guarantor 
if the debtor defaults. Similar to (17.2) the integral in (17.3) can be simplified 
by assuming that on average a debtor defaults in the middle of each period. This 
leads to 


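$$(1 - R_i) \cdot N_i \cdot \int_{T_{i-1}}^{T_i} df(t) \cdot \left( q(t) - q(t + dt) \right) \approx (1 - R_i) \cdot N_i \cdot df\left(0.5 \cdot (T_i + T_{i-1})\right) \cdot \left( q(T_{i-1}) - q(T_i) \right) \qquad (17.4)$$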


Again we end up with an easy to implement formula. 
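
A minimal Python sketch of the guarantee valuation, assuming the midpoint approximation (17.4) and premiums paid at the end of each period. The function names and the example inputs are hypothetical. The break-even premium is obtained from the condition that the value of the guarantee is zero.

```python
from typing import Callable, Sequence, Tuple


def guarantee_legs(
    pay_times: Sequence[float],
    notionals: Sequence[float],
    recoveries: Sequence[float],
    df: Callable[[float], float],
    q: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (premium annuity, expected discounted loss) of a guarantee."""
    premium_annuity = 0.0
    default_leg = 0.0
    t_prev = 0.0  # T_0 = 0
    for t, n, r in zip(pay_times, notionals, recoveries):
        tau = t - t_prev  # assumption: year fraction equals the period length
        # Premium per unit of g, received only if the debtor survives the period.
        premium_annuity += tau * n * df(t) * q(t)
        # Loss of the guarantor, midpoint approximation (17.4).
        default_leg += (1.0 - r) * n * df(0.5 * (t + t_prev)) * (q(t_prev) - q(t))
        t_prev = t
    return premium_annuity, default_leg


def guarantee_value(g, pay_times, notionals, recoveries, df, q) -> float:
    """Value of the guarantee from the guarantor's perspective."""
    annuity, loss = guarantee_legs(pay_times, notionals, recoveries, df, q)
    return g * annuity - loss


def break_even_premium(pay_times, notionals, recoveries, df, q) -> float:
    """Premium g for which the value of the guarantee is zero."""
    annuity, loss = guarantee_legs(pay_times, notionals, recoveries, df, q)
    return loss / annuity


# Hypothetical example: guarantee on a 5-year bullet loan, annual premiums.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
notionals = [100.0] * 5
recoveries = [0.4] * 5
flat_df = lambda t: 1.05 ** (-t)
risky_q = lambda t: 0.98 ** t  # made-up survival curve

g_star = break_even_premium(times, notionals, recoveries, flat_df, risky_q)
value_at_g_star = guarantee_value(g_star, times, notionals, recoveries, flat_df, risky_q)
```

Splitting the value into a premium annuity and a default leg keeps the premium linear in the formula, so the break-even premium needs no numerical search.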


17.3 Applications 


In this section we outline how to apply the pricing formulas that were derived in 
Sects. 17.2.1 and 17.2.2 in banking practice. In the next section, we explain a calculation
scheme for a loan’s terms based on RAROC (risk-adjusted return on capital). After
that, we show how to compute general loss provisions for a loan portfolio in an 
economically meaningful way. Both applications are illustrated with concrete 
numerical examples. 


17.3.1 Calculation of a Loan’s Terms 


As already outlined above, we explain a scheme for calculating a loan’s terms based 
on the performance measure RAROC. RAROC measures the revenues of a loan 
in relation to its risk. In our context, risk is defined as the economic capital that 
is needed as a buffer against unexpected losses of the loan. Economic capital is 
typically measured by the risk contribution of a loan to the total credit risk of a bank 
that is typically computed as the value-at-risk or the expected shortfall of a loss 
distribution that is generated by a credit risk model. A good introduction to credit 
risk modelling is Bluhm et al. (2003). Risk measures like value-at-risk and expected 
shortfall and their properties are analyzed in Artzner et al. (1999) and Tasche 
(2002), while good references for capital allocation techniques are Kalkbrenner 
(2005), Kurth and Tasche (2003), and Kalkbrenner et al. (2004). 

Using a simple one-factor credit risk model, Gordy (2003) has derived an
analytical formula for the economic capital related to a loan exposure which is 
also used in the Basel II capital accord (BCBS 2004) for the calculation of 
regulatory capital under the internal ratings based approach. Using the exact para- 
meterization of Basel II, the economic capital E per loan exposure is computed as 


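$$E = (1 - R) \cdot \left( \Phi\left( \frac{1}{\sqrt{1 - \rho}} \cdot \Phi^{-1}(PD) + \sqrt{\frac{\rho}{1 - \rho}} \cdot \Phi^{-1}(\alpha) \right) - PD \right) \cdot b(M)$$

$$b(M) = \frac{1 + \left( \max[1; \min[M; 5]] - 2.5 \right) \cdot \left( 0.11852 - 0.05478 \cdot \log(PD) \right)^2}{1 - 1.5 \cdot \left( 0.11852 - 0.05478 \cdot \log(PD) \right)^2} \qquad (17.5)$$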


where PD is the 1-year default probability of the debtor, R the recovery rate 
corresponding to a default within 1 year, $\rho$ is the mean asset correlation among
all debtors of a bank, $\alpha$ the confidence level where the value-at-risk of the loss
distribution is computed, and $\Phi$ the cumulative distribution function of the standard
normal distribution. For $\rho$ and $\alpha$ it is possible to use own estimations or the values
given in the Basel II accord. The term $b$ is the maturity adjustment that reflects
the increased risk of a loan with a higher maturity. One assumption behind the 
derivation of this formula was the absence of concentration risk in the loan 
portfolio. If the loan portfolio contains significant concentrations, the formula can 
be adjusted by adding an additional factor for granularity. A possible way to 
compute this add-on for volume concentration risk is described in Gordy and 
Lütkebohmert-Holtz (2007).
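
The economic capital formula (17.5) can be sketched in Python with the standard library only. The function names are hypothetical. The logarithm is taken as the natural logarithm, and the default confidence level of 99.9% as well as the asset correlation of 15% used in the example are, to our understanding, the Basel II values for residential mortgages. Making the maturity adjustment optional is a convenience of this sketch.

```python
from math import log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sigma 1)


def maturity_adjustment(pd: float, maturity: float) -> float:
    """Maturity adjustment b(M) of (17.5); log is the natural logarithm."""
    b = (0.11852 - 0.05478 * log(pd)) ** 2
    m = max(1.0, min(maturity, 5.0))  # maturity floored at 1 and capped at 5 years
    return (1.0 + (m - 2.5) * b) / (1.0 - 1.5 * b)


def economic_capital(pd, recovery, rho, alpha=0.999, maturity=None) -> float:
    """Economic capital per unit of exposure according to (17.5).

    pd must lie strictly between 0 and 1. If maturity is None the
    maturity adjustment is left out (assumption of this sketch).
    """
    x = (_STD_NORMAL.inv_cdf(pd) + sqrt(rho) * _STD_NORMAL.inv_cdf(alpha)) / sqrt(1.0 - rho)
    capital = (1.0 - recovery) * (_STD_NORMAL.cdf(x) - pd)
    if maturity is not None:
        capital *= maturity_adjustment(pd, maturity)
    return capital


# Example: 1-year PD of 1.67%, recovery rate 40%, asset correlation 15%.
e_example = economic_capital(0.0167, 0.4, 0.15)
```

The argument of $\Phi$ is written as a single fraction here, which is algebraically the same as the sum of the two terms in (17.5).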

A loan’s revenue is computed as the difference of the interest earned and the 
costs associated with the loan. If the loan were riskless and all internal costs and
the funding spread were zero, the interest rates $z_i$ would have to be equal to the
swap rate $s$ corresponding to the loan’s maturity to bear the loan’s funding costs.
This swap rate is computed as 


$$s = \frac{1 - df(T_n)}{\sum_{i=1}^{n} \tau_i \cdot df(T_i)} \qquad (17.6)$$


Using $s$ as a base rate, the total margin of a loan is defined as $m = z_{eff} - s$. Here,
$z_{eff}$ is defined as the period-independent fixed (“effective”) interest rate that defines
a fixed-rate loan that has the same value as the loan with interest rates $z_i$. To be more
specific, $z_{eff}$ is defined from the condition


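$$\sum_{i=1}^{n} \left( z_{eff} \cdot \tau_i \cdot N_i + A_i \right) \cdot df(T_i) \cdot q(T_i) = \sum_{i=1}^{n} \left( z_i \cdot \tau_i \cdot N_i + A_i \right) \cdot df(T_i) \cdot q(T_i)$$


which can be solved explicitly for $z_{eff}$:


$$z_{eff} = \frac{\sum_{i=1}^{n} z_i \cdot \tau_i \cdot N_i \cdot df(T_i) \cdot q(T_i)}{\sum_{i=1}^{n} \tau_i \cdot N_i \cdot df(T_i) \cdot q(T_i)}$$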
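
As a small illustration, the swap rate (17.6), the effective interest rate, and the resulting total margin can be computed as in the following Python sketch. The function names and the example inputs are hypothetical, and the year fraction is again taken to be the period length.

```python
from typing import Callable, Sequence


def swap_rate(pay_times: Sequence[float], df: Callable[[float], float]) -> float:
    """Swap rate s for the final payment date according to (17.6)."""
    annuity = 0.0
    t_prev = 0.0
    for t in pay_times:
        annuity += (t - t_prev) * df(t)
        t_prev = t
    return (1.0 - df(pay_times[-1])) / annuity


def effective_rate(pay_times, rates, notionals, df, q) -> float:
    """Fixed rate giving the same loan value as the period rates z_i."""
    numerator = 0.0
    denominator = 0.0
    t_prev = 0.0
    for t, z, n in zip(pay_times, rates, notionals):
        weight = (t - t_prev) * n * df(t) * q(t)  # tau_i * N_i * df(T_i) * q(T_i)
        numerator += z * weight
        denominator += weight
        t_prev = t
    return numerator / denominator


# Hypothetical example: 15 annual periods, flat 5% zero rate, made-up rates.
times = [float(i) for i in range(1, 16)]
flat_df = lambda t: 1.05 ** (-t)
risky_q = lambda t: 0.98 ** t
rates = [0.06 + 0.001 * i for i in range(15)]  # period rates z_i
notionals = [100.0] * 15

s = swap_rate(times, flat_df)
z_eff = effective_rate(times, rates, notionals, flat_df, risky_q)
total_margin = z_eff - s  # m = z_eff - s
```

The effective rate is a weighted average of the period rates, so it always lies between the smallest and the largest $z_i$.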


The costs associated with a loan are on the one hand the risk costs to cover the 
expected loss of the loan, on the other hand all other costs of the lender (funding 
spread, operating costs, internal fees, etc.). The risk costs can be computed from 
(17.1) applying the condition V = N. If the discounted expected value of all future 
cash flows equals the outstanding notional, it is ensured that expected losses are 
covered. 

In Sect. 17.2.1 we considered a bullet loan, an instalment loan, and an annuity loan
as the most popular loan types explicitly. Calculating risk costs for a bullet loan 
with a period-independent fixed interest rate leads to 


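$$r = \frac{1 - df(T_n) \cdot q(T_n) - \sum_{i=1}^{n} R_i \cdot df\left(0.5 \cdot (T_i + T_{i-1})\right) \cdot \left( q(T_{i-1}) - q(T_i) \right)}{\sum_{i=1}^{n} \tau_i \cdot df(T_i) \cdot q(T_i)} - s$$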


For an instalment loan we report the formula for the case of a floating interest 
rate 


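$$r = \frac{N - \sum_{i=1}^{n} \left( A_i + f_i \cdot \tau_i \cdot N_i \right) \cdot df(T_i) \cdot q(T_i)}{\sum_{i=1}^{n} \tau_i \cdot N_i \cdot df(T_i) \cdot q(T_i)} - \frac{\sum_{i=1}^{n} R_i \cdot N_i \cdot df\left(0.5 \cdot (T_i + T_{i-1})\right) \cdot \left( q(T_{i-1}) - q(T_i) \right)}{\sum_{i=1}^{n} \tau_i \cdot N_i \cdot df(T_i) \cdot q(T_i)}$$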


In this case the risk cost r is the spread over Libor that must be charged by a bank 
to cover expected losses of the loan. 
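
The risk costs of a floating-rate loan can be sketched in Python as follows. The function name is hypothetical. The forward rates are computed as simple forward rates from the discount curve, which is an assumption of this sketch, and the example uses a made-up survival curve together with a collateral value of 40% of the original notional.

```python
def risk_cost_floating(pay_times, notionals, amortizations, recoveries, df, q) -> float:
    """Spread over the forward rates that solves V = N for a floating-rate loan."""
    pv_payments = 0.0  # amortization plus forward-rate interest
    pv_recovery = 0.0  # recovery leg with the midpoint approximation (17.2)
    annuity = 0.0      # risky annuity that multiplies the spread r
    t_prev = 0.0       # T_0 = 0
    for t, n, a, rec in zip(pay_times, notionals, amortizations, recoveries):
        tau = t - t_prev
        fwd = (df(t_prev) / df(t) - 1.0) / tau  # simple forward rate f_i
        weight = df(t) * q(t)
        pv_payments += (a + fwd * tau * n) * weight
        pv_recovery += rec * n * df(0.5 * (t + t_prev)) * (q(t_prev) - q(t))
        annuity += tau * n * weight
        t_prev = t
    return (notionals[0] - pv_payments - pv_recovery) / annuity


# Hypothetical instalment loan: notional 100, quarterly payments over 15 years,
# 4% annual amortization, collateral worth 40% of the original notional.
times = [0.25 * i for i in range(1, 61)]
notionals = [100.0 - i for i in range(60)]
amortizations = [1.0] * 59 + [notionals[-1]]
recoveries = [min(1.0, 40.0 / n) for n in notionals]  # increases as the loan amortizes
flat_df = lambda t: 1.05 ** (-t)
risky_q = lambda t: 0.98 ** t  # made-up survival curve

r = risk_cost_floating(times, notionals, amortizations, recoveries, flat_df, risky_q)
r_riskless = risk_cost_floating(
    times, notionals, amortizations, recoveries, flat_df, lambda t: 1.0
)
```

For a debtor who cannot default the spread is zero up to rounding, because a loan paying the forward rates on its outstanding notional is worth par.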

For an annuity loan it is impossible to calculate the risk costs analytically 
because the amortization schedule depends on the level of the fixed interest rate. 
Here a zero-search algorithm has to be applied for the calculation of risk costs. 
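
One possible zero search is plain bisection on the fixed interest rate, sketched below in Python. Several choices are assumptions of this sketch and not taken from the text: the payment dates are equally spaced, the constant payment follows from the standard annuity formula, the recovery rate is constant, and the function names are hypothetical. By analogy with the bullet loan, the risk costs would then be the break-even rate minus the swap rate.

```python
def annuity_loan_value(z, notional, pay_times, recovery, df, q) -> float:
    """Value of an annuity loan with fixed rate z, using (17.1) and (17.2)."""
    n = len(pay_times)
    tau = pay_times[0]  # assumption: equally spaced payment dates, T_0 = 0
    x = z * tau
    # Constant payment K from the standard annuity formula (assumption).
    payment = notional / n if x == 0.0 else notional * x / (1.0 - (1.0 + x) ** (-n))
    value, outstanding, t_prev = 0.0, notional, 0.0
    for t in pay_times:
        value += payment * df(t) * q(t)  # interest plus amortization equals K
        value += recovery * outstanding * df(0.5 * (t + t_prev)) * (q(t_prev) - q(t))
        outstanding -= payment - x * outstanding  # schedule depends on z
        t_prev = t
    return value


def break_even_rate(notional, pay_times, recovery, df, q, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for the fixed rate z with V(z) = N."""
    f = lambda z: annuity_loan_value(z, notional, pay_times, recovery, df, q) - notional
    if f(lo) > 0.0 or f(hi) < 0.0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid  # value below par: the rate must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: 10-year annuity loan, annual payments, flat 5% curve.
times = [float(i) for i in range(1, 11)]
flat_df = lambda t: 1.05 ** (-t)
z_riskless = break_even_rate(100.0, times, 0.4, flat_df, lambda t: 1.0)
z_risky = break_even_rate(100.0, times, 0.4, flat_df, lambda t: 0.98 ** t)
```

Bisection is used here only because it is simple and robust. Any other one-dimensional root finder would serve equally well, since the loan value increases with the fixed rate.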

Finally all other cost components must be transformed into an annualized cost 
margin per notional. This cost margin is denoted by c. Since internal cost structures 
differ from bank to bank there is no general rule how to aggregate cost components 
to a cost margin. It might be possible that the largest part of the internal costs has to 
be paid as an upfront payment by the debtor, or that no upfront payment is required 
and all costs are included into the loan’s interest margin, or that some combination
of these two extremes is applied. In any case a bank has to ensure that the expected
present value of future cost payments covers all internal and refinancing costs. To 
make this clear we take the funding spread $s_{fu}$ a bank has to pay as an example. We
assume that a bank has to pay the funding spread for a loan until the loan’s maturity
regardless of whether the debtor defaults. Under this assumption the cost margin $c_{fu}$
corresponding to funding costs can be computed from the equation


$$\sum_{i=1}^{n} N_i \cdot s_{fu} \cdot \tau_i \cdot df(T_i) = \sum_{i=1}^{n} N_i \cdot c_{fu} \cdot \tau_i \cdot df(T_i) \cdot q(T_i)$$


The left hand side is the funding spread that is paid by the bank over the loan’s 
lifetime. The right hand side equals the expected present value of the cost margin 
payment by the debtor. The cost margin will be paid unless the debtor is in default. 
For the cost margin corresponding to funding costs we get the explicit formula 


$$c_{fu} = s_{fu} \cdot \frac{\sum_{i=1}^{n} N_i \cdot \tau_i \cdot df(T_i)}{\sum_{i=1}^{n} N_i \cdot \tau_i \cdot df(T_i) \cdot q(T_i)}$$


Similar formulas have to be derived for other cost components. After that all cost 
margins corresponding to all cost components have to be aggregated to the total 
cost margin c. 
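
The funding cost margin can be sketched in Python as follows. The function name and the example inputs, including the funding spread of 50 basis points, are hypothetical.

```python
def funding_cost_margin(s_fu, pay_times, notionals, df, q) -> float:
    """Cost margin c_fu that recovers the funding spread s_fu in expectation."""
    pv_paid = 0.0      # funding spread annuity, paid regardless of default
    pv_received = 0.0  # cost margin annuity, received only while the debtor survives
    t_prev = 0.0       # T_0 = 0
    for t, n in zip(pay_times, notionals):
        tau = t - t_prev
        pv_paid += n * tau * df(t)
        pv_received += n * tau * df(t) * q(t)
        t_prev = t
    return s_fu * pv_paid / pv_received


# Hypothetical example: 5-year bullet loan, funding spread of 50 basis points.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
notionals = [100.0] * 5
flat_df = lambda t: 1.05 ** (-t)
c_fu = funding_cost_margin(0.005, times, notionals, flat_df, lambda t: 0.98 ** t)
c_fu_riskless = funding_cost_margin(0.005, times, notionals, flat_df, lambda t: 1.0)
```

The margin exceeds the funding spread whenever the debtor can default, because the surviving payments have to carry the funding costs of the periods after a default.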

Using the above margin components, we compute the RAROC of a loan as 

$$RAROC = \frac{m - r - c}{E} \qquad (17.7)$$


RAROC measures the return on economic capital that is realized by selling a 
loan for a total margin of m. Typically, a bank defines a minimum level of the return 
on economic capital that must be reached to consider a loan as profitable. This 
minimum level is called the hurdle rate $h$. From the hurdle rate, the minimum
margin that must be gained by a loan investment can be computed as 


$$m_{min} = r + h \cdot E + c \qquad (17.8)$$


In (17.8) the definition of the minimum margin ensures that a loan is profitable 
for a bank. If for some reasons a loan is sold below the minimum margin, the effect 
on the realized return on economic capital can be measured by (17.7). The three 
components in (17.8) can be interpreted intuitively. The first component is a 
compensation for expected losses, the second component is a compensation for 
unexpected losses, and the final component is a compensation for all other costs that 
are related to a loan. 
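
Formulas (17.7) and (17.8) translate directly into code. The short Python sketch below uses hypothetical function names. The example numbers are those of rating grade 4 in Table 17.2, where opportunity costs of 0.2864% at a hurdle rate of 10% imply an economic capital of 2.864%.

```python
def raroc(m: float, r: float, c: float, e: float) -> float:
    """Return on economic capital for a total margin m, see (17.7)."""
    return (m - r - c) / e


def minimum_margin(r: float, h: float, e: float, c: float) -> float:
    """Minimum margin for a hurdle rate h, see (17.8)."""
    return r + h * e + c


# Rating grade 4 of Table 17.2: risk costs 0.3330%, cost margin 1%, h = 10%.
r, h, e, c = 0.003330, 0.10, 0.02864, 0.01
m_min = minimum_margin(r, h, e, c)

# A loan sold 20 basis points below the minimum margin earns less than h.
raroc_discounted = raroc(m_min - 0.002, r, c, e)
```

Evaluating `raroc` at the minimum margin returns the hurdle rate, which is a convenient consistency check between the two formulas.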

The RAROC scheme for a loan can also be applied to a guarantee. The only 
modification is that the risk costs of a guarantee are computed from the condition 
$G = 0$ where $G$ is the value of the guarantee that was defined in (17.3). All other
parts of the RAROC pricing formula for a loan are identical for a guarantee. 

We conclude this section with an illustrative example. We consider an instal- 
ment loan with a maturity of 15 years. The loan’s interest rate is 3M Libor plus a 
margin, i.e. the loan pays interest quarterly. Further, the annual amortization rate of 
the loan is 4%. We assume that a bank computes economic capital according to the 
Basel II formulas in (17.5) with the parameterization of Basel II for residential 
mortgages. Further, we assume that the bank uses a hurdle rate h of 10% and its 
internal and funding costs are properly reflected by a cost margin of 1%. The bank’s 
rating system is described by the transition matrix in Fig. 6.7 of Chap. 6, i.e. the 
bank uses nine rating grades where the final grade is the default grade. We further 
assume that the value of the collateral that is posted by a debtor equals 40% of the 
original loan amount. This means that the recovery rate is increasing in time 
because of the amortization effects. Finally, we assume a flat zero rate of 5% 
with annual compounding to compute discount factors. 

To generate the survival probabilities in (17.1) and (17.2) it is necessary to multiply
the 1-year transition matrix with itself to compute multi-year default probabilities as 
described in Chap. 6. We report the resulting multi-year default probabilities over 15 
years for each rating grade in Table 17.1. From these default probabilities we can 
compute the term structures of survival probabilities that are needed in (17.1) and 
(17.2). Since our loan pays interest quarterly we also need survival probabilities at 
intermediate points in time. They can be generated by computing transition matrices 
corresponding to year fractions as explained in Chap. 6. An alternative could be a 
simple interpolation scheme like linear interpolation of the logarithms of the survival 
probabilities which approximately leads to the same result. 
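
The two steps can be sketched in Python as follows: repeated multiplication of the 1-year transition matrix yields the cumulative default probabilities, and linear interpolation of the logarithms of the survival probabilities yields values at intermediate dates. The 3-state matrix below is purely hypothetical and is not the matrix of Fig. 6.7. The sketch assumes that the default state is the last state and is absorbing. The function names are also hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def cumulative_pds(p1, years):
    """Cumulative default probabilities per non-default grade for 1..years."""
    n = len(p1)
    power = [row[:] for row in p1]  # P^1
    result = [[] for _ in range(n - 1)]
    for _ in range(years):
        for grade in range(n - 1):
            # The last column holds the probability of being in default.
            result[grade].append(power[grade][n - 1])
        power = mat_mul(power, p1)  # next power of the matrix
    return result


def survival_at(t, cum_pd):
    """Survival probability at time t from yearly cumulative PDs.

    Uses linear interpolation of log survival probabilities between
    the yearly nodes 0, 1, ..., len(cum_pd), with q(0) = 1.
    """
    surv = [1.0] + [1.0 - p for p in cum_pd]
    if t >= len(cum_pd):
        return surv[-1]
    lo = int(math.floor(t))
    w = t - lo
    return math.exp((1.0 - w) * math.log(surv[lo]) + w * math.log(surv[lo + 1]))


# Hypothetical 1-year transition matrix with grades A, B and default.
P = [
    [0.90, 0.08, 0.02],
    [0.10, 0.80, 0.10],
    [0.00, 0.00, 1.00],
]
pds = cumulative_pds(P, 15)
q_quarter = survival_at(0.25, pds[0])  # survival of grade A after three months
```

Log-linear interpolation corresponds to a constant default intensity between two yearly nodes, which is why it stays close to the result obtained from transition matrices for year fractions.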


Table 17.1 Multi-year cumulative default probabilities for each rating grade computed from the 
1-year transition matrix in Fig. 6.7 of Chap. 6 


Time (years)  Rating grade
              1       2       3       4       5       6       7       8
 1            0.0000  0.0008  0.0009  0.0036  0.0167  0.0496  0.1490  0.2496
 2            0.0002  0.0016  0.0022  0.0085  0.0359  0.1010  0.2673  0.4173
 3            0.0004  0.0026  0.0040  0.0148  0.0572  0.1516  0.3612  0.5325
 4            0.0007  0.0037  0.0063  0.0223  0.0799  0.2001  0.4361  0.6134
 5            0.0011  0.0051  0.0091  0.0309  0.1037  0.2455  0.4964  0.6718
 6            0.0017  0.0066  0.0125  0.0405  0.1279  0.2875  0.5454  0.7149
 7            0.0024  0.0085  0.0164  0.0509  0.1522  0.3261  0.5856  0.7476
 8            0.0033  0.0106  0.0209  0.0621  0.1762  0.3614  0.6191  0.7729
 9            0.0043  0.0130  0.0259  0.0738  0.1999  0.3936  0.6472  0.7931
10            0.0055  0.0157  0.0314  0.0861  0.2229  0.4230  0.6711  0.8094
11            0.0069  0.0187  0.0374  0.0987  0.2451  0.4498  0.6917  0.8230
12            0.0086  0.0221  0.0439  0.1117  0.2665  0.4743  0.7095  0.8344
13            0.0104  0.0258  0.0508  0.1248  0.2871  0.4966  0.7252  0.8442
14            0.0125  0.0299  0.0581  0.1380  0.3068  0.5172  0.7390  0.8526
15            0.0149  0.0342  0.0658  0.1513  0.3256  0.5360  0.7514  0.8600

Table 17.2 Cost components of the instalment loan for the calculation of the loan’s terms in the example

Rating grade  Risk costs (%)  Opp. costs (%)  Minimum margin (%)
1              0.0193         0.0469           1.0662
2              0.0594         0.0888           1.1482
3              0.1137         0.1023           1.2161
4              0.3330         0.2864           1.6194
5              1.0146         0.8258           2.8404
6              2.4618         1.5762           5.0380
7              6.1548         2.5831           8.7380
8             10.8426         2.9259          14.7684

In Table 17.2, we report the minimum margins that must be charged by a bank to 
cover all its costs according to (17.8). Since the loans have a floating interest rate, 
the minimum margins are the spreads over Libor that must be paid by a debtor. To 
see the main drivers of the minimum margin we report all cost components. The 
cost margin (corresponding to c in (17.8)) which is not reported in Table 17.2 is 
independent of the rating grade and by assumption equal to 1%. In the table, we 
report the risk costs (r in (17.8)) and the opportunity costs of capital (h·E in (17.8)).
We see that the risk costs are lower than the capital costs for the good rating grades 
while it is vice versa for the poor rating grades. For poor rating grades the expected 
loss is already rather high because of the high default probabilities. Therefore, the 
surprise component of unexpected losses is relatively low for these rating grades. 
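
The decomposition can be checked numerically. The sketch below assumes the additive reading of (17.8) that the text describes, i.e. minimum margin = r + h·E + c, and uses the risk and capital cost figures of Table 17.2; the function name `minimum_margin` is introduced here for illustration only.

```python
# Cost components in percent per rating grade, taken from Table 17.2:
# (risk costs r, opportunity costs of capital h*E)
components = {
    1: (0.0193, 0.0469),
    2: (0.0594, 0.0888),
    3: (0.1137, 0.1023),
    4: (0.3330, 0.2864),
    5: (1.0146, 0.8258),
    6: (2.4618, 1.5762),
    7: (6.1548, 2.5831),
    8: (10.8426, 2.9259),
}
COST_MARGIN = 1.0  # c in (17.8), in percent, as assumed in the text

def minimum_margin(risk_costs, capital_costs, cost_margin=COST_MARGIN):
    """Additive reading of (17.8): m = r + h*E + c (all in percent)."""
    return risk_costs + capital_costs + cost_margin

for grade, (r, h_e) in components.items():
    m = minimum_margin(r, h_e)
    larger = "risk costs" if r > h_e else "capital costs"
    print(f"grade {grade}: minimum margin = {m:7.4f}%  (larger component: {larger})")
```

Up to rounding in the last digit, the sums reproduce the minimum margin column, and the output shows at which rating grade the risk costs start to exceed the opportunity costs of capital.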


17.3.2 Calculation of General Loss Provisions 


We start this section by describing a simple framework for managing the risks of 
credit losses in our model framework. We explain the general principle without 
going deeply into accounting details which depend on the specific accounting 
framework that is applied by a bank. 

The aim is to manage loan portfolios in a way that ensures that a bank does not 
suffer losses even if defaults in the loan portfolio occur. For this, the component of 
the interest margin that reflects the expected loss risk of a loan is collected and 
stored in an “expected loss account”. When defaults happen, the values of the 
defaulted loans are corrected to their expected recovery values. If the estimated 
default probabilities and recovery rates are on average close to their realizations, 
the sum of the proceeds from liquidating the collateral of defaulted loans plus the
money in the expected loss account is on average close to the present value of the
bonds that were issued to fund the loan.

In this context, “on average” means that during a recession realized default rates 
are higher than default probabilities, while during a boom they are lower. There- 
fore, booms should be used to fill the expected loss account for the bad times when 
it is needed. Furthermore, “on average” means that this approach will only work 
well for large loan portfolios, e.g. in the retail sector. A bank with a highly
specialized business and a relatively low number of loans will have difficulties in 
implementing this approach because if the number of loans is small, it is expected 
that realized default rates are in general less in alignment with default probabilities 
which makes this approach of managing loan loss risks less accurate. 
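
The effect of portfolio size can be made concrete with a back-of-the-envelope calculation. As a simplifying assumption that is not part of the chapter's framework, suppose the n debtors of a portfolio default independently with the same probability p. The realized default rate then has a standard deviation of sqrt(p(1 - p)/n), which the sketch below evaluates for the 1-year default probability of rating grade 5.

```python
import math

def default_rate_std(p, n):
    """Standard deviation of the realized default rate of n loans that
    default independently with probability p (simplifying assumption)."""
    return math.sqrt(p * (1.0 - p) / n)

p = 0.0167  # 1-year default probability of rating grade 5 (Table 17.1)
for n in (50, 500, 50_000):
    sd = default_rate_std(p, n)
    print(f"n = {n:6d}: std of default rate = {sd:.4%}  ({sd / p:.0%} of p)")
```

For a handful of loans the standard deviation is of the same order as the default probability itself, while for a retail-sized portfolio it is a small fraction of it. Correlated defaults, as in a recession, would widen the deviation further, so the independent case should be read as a lower bound.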

Usually credit losses do not occur by surprise but downgrades of debtors signal
an increase in default probabilities early. If the valuation of a loan is done by (17.1),
changes in the rating of a debtor and changes in interest rates are directly reflected
in an increase or decrease of a loan’s value. This allows the building of loss
provisions for a loan portfolio to make the process of realizing unexpected losses
smooth. For this, the loan portfolio has to be valued at regular intervals and
provisions for losses have to be built. For the calculation of the loss provision
per loan the condition V = N that was used to compute the risk margin in the last
section will be applied. Suppose a loan was sold for an interest rate z if it is a fixed
rate loan or for a margin m over Libor if it is a floating rate loan. We define the
interest rate z_r (or the margin m_r) that contains the risk costs only by


z_r = z - h \cdot E - c
m_r = m - h \cdot E - c          (17.9)


If the loan is sold for an interest rate computed by (17.8) then using the interest 
rate (17.9) and valuing the loan using (17.1) leads to a loan value V equal to N, the 
initial notional. If the loan is valued at a later stage, the changes in interest rates and 
in the debtor’s credit quality might lead to risk costs that are not reflected in (17.9). 
Therefore, valuing the loan using the interest rate (17.9) and subtracting the 
outstanding notional of the loan from the result gives the gain or loss in the 
loan’s value. This is a reasonable quantity for building a provision. 

We will illustrate this concept with a simple example. We consider a portfolio of 
ten instalment loans that were all sold on March 31, 2009. All loans have the 
structure of the example loan in Sect. 17.3.1, i.e. instalment loans with a maturity of 15
years and an amortization rate of 4%. The rating of each debtor and the initial 
notional of each loan are reported in Table 17.3. 


Table 17.3 Example portfolio of 15Y instalment loans


Number of loan   Debtor rating   Initial notional

 1               1               1,000,000
 2               2                 500,000
 3               3                 750,000
 4               3                 750,000
 5               3               1,000,000
 6               4                 750,000
 7               4                 600,000
 8               5                 400,000
 9               5                 750,000
10               5                 500,000

The loan portfolio has a total notional of seven million. We further assume that
each loan was sold for the minimum margin that was computed in Table 17.2. We
compute the general loss provision for this portfolio as of April 1, 2012. We assume
that interest rates at this date are given by a flat zero curve of 4% with annual
compounding. Since the loans have a floating interest rate their interest rate
sensitivity is moderate. Over the 3 years the ratings of some debtors have improved
while the credit quality of other debtors has deteriorated. As in Sect. 17.3.1 we assume
that the economic capital for each loan is computed by the Basel formula (17.5).
This means that the economic capital changes with the rating. When we compute 
the interest rate (17.9) we use the economic capital corresponding to the current 
rating. Finally, we assume that the bank’s cost structure did not change and 1% is 
still the appropriate cost margin. We report the value of each loan when it is priced 
using the interest rate (17.9) together with the outstanding notional in Table 17.4. 

If the rating grade remains unchanged as in the case of loan 2 the value of the 
loan increases. The reason is that after 3 years the expected loss margin 
corresponding to a then 12-year loan is less than the original margin corresponding 
to a 15-year loan. This leads to an increase in the loan’s value. In this example, 
however, the increase is mild because the expected loss for a debtor in rating grade 
2 is rather small. 

Overall we see that the credit quality of the portfolio has deteriorated and some 
loans are worth considerably less than the outstanding notional when the pricing is 
done with the interest rate (17.9). The reason is that both the opportunity cost of 
capital and the expected loss margin calculated under the new rating have 
increased. In total we find a loss in portfolio value of 147,032. This should be 
reported as the general loss provision for this portfolio in the balance sheet of the 
bank. This number compares to a total of 69,653 in expected loss margins that have
been earned by the bank over the 3 years since the loan portfolio was sold.³


Table 17.4 Valuation of the loan portfolio of Table 17.3 after interest rates and ratings have 
changed three years after the loans in the portfolio were sold 


Number of loan   Debtor rating   Loan value   Outstanding notional   P&L

 1               4               852,043      880,000                 -27,957
 2               2               440,752      440,000                     752
 3               7               446,046      660,000                -213,954
 4               1               667,668      660,000                   7,668
 5               2               885,774      880,000                   5,774
 6               3               680,791      660,000                  20,791
 7               3               544,633      528,000                  16,633
 8               7               262,234      352,000                 -89,766
 9               2               740,948      660,000                  80,948
10               3               492,079      440,000                  52,079


³ We have neglected interest rate effects when computing this number. It is just the sum over the
expected loss margins that were charged by the bank.

Overall, the expected loss margins earned so far do not cover the deterioration in the
loan portfolio’s value at this point in time. 
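
The provision figure can be reproduced from the two tables. The sketch below is illustrative only: it takes the loan values of Table 17.4 as given (computing them would require the valuation formula (17.1) with the rate (17.9)), derives the outstanding notional from the initial notional of Table 17.3 and the 4% amortization rate, and sums the differences. The helper name `outstanding_notional` is our own.

```python
AMORTIZATION_RATE = 0.04  # per year, as in the example loan
YEARS_ELAPSED = 3

# loan number -> (initial notional from Table 17.3, loan value from Table 17.4)
loans = {
    1: (1_000_000, 852_043),
    2: (500_000, 440_752),
    3: (750_000, 446_046),
    4: (750_000, 667_668),
    5: (1_000_000, 885_774),
    6: (750_000, 680_791),
    7: (600_000, 544_633),
    8: (400_000, 262_234),
    9: (750_000, 740_948),
    10: (500_000, 492_079),
}

def outstanding_notional(initial, rate=AMORTIZATION_RATE, years=YEARS_ELAPSED):
    """Outstanding notional of an instalment loan after `years` years."""
    return round(initial * (1.0 - rate * years))

# Gain or loss per loan: value under the rate (17.9) minus outstanding notional
pnl = {k: value - outstanding_notional(initial) for k, (initial, value) in loans.items()}
portfolio_pnl = sum(pnl.values())

for k, x in pnl.items():
    print(f"loan {k:2d}: P&L = {x:>9,d}")
print(f"portfolio P&L = {portfolio_pnl:,d}")
```

The portfolio total is negative, and its absolute value is the amount the text proposes to report as the general loss provision. Most of it stems from the two loans whose debtors were downgraded to rating grade 7.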

Finally, we remark that computing the value of a loan by (17.1) using the interest 
rate (17.9) is not only useful for the calculation of loss provisions. This is also the 
price that should be paid for a loan when it is sold. The reason is that the interest 
paid by a loan must cover the desired return on economic capital and the internal 
costs of a bank. Therefore, these cost components have to be subtracted from the 
total margin of a loan when computing the expected discounted value of its future 
cash flows. 


17.4 Discussion 


In this article we have presented a framework for the risk-adjusted pricing of loans 
and guarantees. To conclude this article we want to discuss its usefulness for 
practical applications. The recent financial crisis has led to a debate on the quality 
of models that are applied by banks. The opinions range from blaming models and
their creators for being a main driver of the crisis to the other extreme that banks
still do not use enough models properly to measure and monitor the risks of their
business.

We take one aspect of the origin of the financial crisis, the lending behaviour of 
American banks. Basically loans were granted to home buyers of poor credit quality 
because it was believed that house prices cannot fall and in the case of a default the 
sale of the house will make up for the loss. We will analyze in the sequel how the 
use of a model will affect the business of a bank under these assumptions. 

We again use the instalment loan of Sect. 17.3.1 as an example. This time we
assume that the value of the collateral is equal to the loan’s initial notional and that,
if the collateral is liquidated, it cannot be worth more than the
loan’s outstanding notional, i.e. that the recovery rate in (17.1) can never exceed
100%. We compute Table 17.2 again under this assumption.

The results are presented in Table 17.5. Not surprisingly we find that the 
minimum margin is basically independent of a debtor’s credit risk. The only diff- 
erence comes from the risk of the creditor of losing an interest rate payment. 


Table 17.5 Recalculation of Table 17.2 under the assumptions described in the text (collateral equal to the initial notional, recovery rate capped at 100%)


Rating grade   Risk costs (%)   Opp. costs (%)   Minimum margin (%)

1              0.0011           0.0000           1.0011
2              0.0017           0.0000           1.0017
3              0.0025           0.0000           1.0025
4              0.0056           0.0000           1.0056
5              0.0142           0.0000           1.0142
6              0.0311           0.0000           1.0311
7              0.0710           0.0000           1.0710
8              0.1202           0.0000           1.1202




Table 17.6 Recalculation of Table 17.2 under the assumptions described in the text (recovery rates above 100% possible)


Rating grade   Risk costs (%)   Opp. costs (%)   Minimum margin (%)

1              -0.0338          0.0000            0.9662
2              -0.0774          0.0000            0.9226
3              -0.1548          0.0000            0.8452
4              -0.3602          0.0000            0.6398
5              -0.7839          0.0000            0.2161
6              -1.3322          0.0000           -0.3322
7              -1.9483          0.0000           -0.9483
8              -2.3547          0.0000           -1.3547


If a debtor defaults the bank has in general unlimited access to the collateral. 
This could in principle lead to recovery rates greater than 100% if a default happens 
after some amortization payments are made and the assumption that a house price 
cannot fall below the initial notional is true. Under this assumption we get the 
margins of Table 17.6. 

The results in Table 17.6 are counterintuitive. Now the model tells us to favour
debtors with low credit quality, and they should even be paid interest for the loan.
The reason is that under this assumption a default is more profitable than earning
interest over the full lifetime of a loan. The profit comes from the amortization
payments. The bank makes the highest profit if a debtor defaults after he has made
some amortization payments. This profit is more likely to be achieved with debtors
of low credit quality. Therefore, they should even receive a fee for entering the loan
instead of paying interest.

From an economic point of view Tables 17.5 and 17.6 deliver reasonable results. 
However, especially the assumptions underlying Table 17.6 result in a business 
model that no one would undertake. In this context a model is just a tool to translate 
assumptions about markets into a business model in a transparent way. It is still the 
task of a risk manager to judge if the resulting business model is reasonable or if 
it is not. The latter case should lead to a questioning of the assumptions underlying 
the model. 

The assumptions underlying the pricing model (17.1) can be questioned in two
ways. It is known from empirical studies by Frye (2000, 2003) that high default rates
are historically accompanied by low recovery rates, i.e. that the assumption of
independence between default and recoveries is wrong from an empirical point of
view. This is also analyzed in Chap. 7. To improve the pricing model (17.1) one
could either model the correlation between default and recovery explicitly, which
would result in a more complicated model, or use a more conservative parameterization,
i.e. instead of using an average LGD one should use a conservative estimate
of the LGD (“downturn LGD”) to acknowledge this effect.

If we modify the example of Table 17.6 by restricting recovery rates to 100%
and assuming that the collateral is worth only 80% of the initial notional in the case
of a default, which still results in a rather high collateralization, we get the
minimum margins for each rating grade that are reported in Table 17.7. In addition
we recompute Table 17.7 for a bullet loan, i.e. we set the amortization rate to zero.
The results for the bullet loan are reported in Table 17.8.


Table 17.7 Recalculation of Table 17.2 under the assumption of a collateral worth 80% of the initial notional


Rating grade   Risk costs (%)   Opp. costs (%)   Minimum margin (%)

1              0.0018           0.0156           1.0174
2              0.0065           0.0296           1.0361
3              0.0102           0.0341           1.0443
4              0.0341           0.0955           1.1295
5              0.1326           0.2753           1.4079
6              0.3952           0.5254           1.9206
7              1.2725           0.8610           3.1336
8              2.5336           0.9753           4.5089


Table 17.8 Recalculation of Table 17.7 under the additional assumption of no amortization


Rating grade   Risk costs (%)   Opp. costs (%)   Minimum margin (%)

1              0.0139           0.0156           1.0295
2              0.0352           0.0296           1.0648
3              0.0690           0.0341           1.1031
4              0.1785           0.0955           1.2740
5              0.4690           0.2753           1.7443
6              1.0115           0.5254           2.5369
7              2.2373           0.8610           4.0983
8              3.7580           0.9753           5.7333


We see that even under mild assumptions on losses in the case of defaults the 
minimum margins for debtors of poor credit quality increase considerably. It is 
questionable that a debtor of poor credit quality, which in practice means low 
income, could afford a loan under these conditions even if Libor rates are very 
low. In the case of Table 17.7 this means that a debtor with a rating of 7 would have 
to pay Libor + 7.13% every year (Libor + 3.13% interest + 4% amortization) and 
in the case of a bullet loan in Table 17.8 would have to pay Libor + 4.10% which 
should be too much for a debtor with low income to buy a home worth a multiple of 
his annual salary. 

These examples illustrate how even a very simple model can increase the trans- 
parency of a business model undertaken by a bank. It forces the bank to declare its 
assumptions on defaults and recovery rates and translates them into minimum 
margins that have to be charged for a debtor of a certain credit quality. These 
assumptions can be verified using empirical data. Further, the consequences of 
small deviations from these assumptions can be analyzed. This can help to increase 
market discipline and prevent banks from charging margins that do not reflect the 
risks of a loan properly. 

Beyond this increase in transparency, the use of a model like (17.1) also brings 
the treatment of loan portfolios more in line with the treatment of other asset classes 
like bonds. Since both asset classes (loans and bonds) are valued using expected 
discounted cash flows, the changes in portfolio value of loans and instruments that 
are issued to fund the loans can be compared directly and mismatches either in 
value or in maturity can be detected easily together with a quantification of the 
corresponding interest rate risks. Finally, we note that the RAROC valuation
approach we presented is flexible enough to be generalized to illiquid equity
investments, which again allows a bank to get a consistent view on different asset
classes. This generalization is done in Engelmann and Kamga-Wafo (2010).


Chapter 18
Risk Management of Loans with Embedded 
Options 


Bernd Engelmann 


18.1 Motivation 


In Chap. 17 it was outlined how the Basel II risk parameters can be used for the risk 
management of loans. It was shown in detail how to apply a risk-adjusted pricing 
formula for the calculation of a loan’s terms and of general loss provisions. In the 
framework of Chap. 17 a loan was characterized by a pre-defined structure of future 
interest rate and amortization payments only. In reality, loans are in general much 
more complex products. 

Often loans contain embedded options. The most popular example of an embedded
option is a prepayment right. Here, a debtor has the right but not the obligation
to pay back certain amounts of a loan in addition to the agreed amortization
schedule. In Germany, banks often allow debtors to pay back 5 or 10% of the initial
notional each year. Furthermore, by law every debtor has the right to pay back the
outstanding notional of a loan after 10 years even if the agreed maturity of the loan 
is longer. In the language of option theory these amortization rights are of European 
or Bermudan style because it is only possible for a debtor to amortize at a discrete 
set of dates. In other countries, prepayment rights are even of American style, i.e. a 
debtor can pay back the outstanding notional at any time. Typically no penalty 
payment by a debtor is required when he pays back a part or all of the outstanding 
notional. Therefore, this right can be of considerable value for a debtor. 

In a floating rate loan, it is possible to define upper and lower bounds for the 
interest rate that has to be paid by a debtor. These bounds are called cap and floor. 
This loan is therefore a mixture of a fixed rate and a floating rate loan. Part of the 
risk of fluctuating interest rates has still to be taken by the debtor but this risk is 
capped. While the embedded interest rate cap is valuable for a debtor because it 
protects him from rising interest rates, the floor is a protection for a bank to ensure 
that the interest income cannot become arbitrarily low. Introducing an interest rate
floor in addition to a cap, therefore, makes the loan cheaper for a debtor. A cap can
be very valuable for a debtor if future interest rate volatility is high.


Table 18.1 Minimum interest rates of the instalment loan for each rating grade


Rating grade   Risk costs (%)   Opp. costs (%)   Minimum rate (%)

1               0.0193          0.0469            5.9751
2               0.0594          0.0888            6.0571
3               0.1137          0.1023            6.1250
4               0.3330          0.2864            6.5283
5               1.0146          0.8258            7.7493
6               2.4618          1.5762            9.9469
7               6.1548          2.5831           14.6469
8              10.8426          2.9259           19.6773


In addition to prepayment rights or caps and floors on floating interest rates, loan 
commitments are often part of a loan. A loan commitment is an option to draw 
additional amounts during a loan’s lifetime. It was already treated in Chaps. 10 and 
11 in the context of EAD modelling. In practice, a debtor pays interest and 
amortization payments for the part of a loan’s total notional that is drawn already 
and a commitment fee for the part that is not yet drawn. In times where banks face 
liquidity problems or are very risk-averse a loan commitment can be of consider- 
able value. 

In this chapter we treat prepayment rights in detail because they are the most 
common embedded options in loans. The mathematical framework presented below 
can be easily modified for caps and floors on variable interest rates. Loan commit- 
ments, however, are a different story. Here, in addition to assumptions on interest 
rates and a debtor’s credit quality, assumptions have to be made on the funding 
conditions of a bank to derive a pricing model for the loan commitment that results 
in a commitment fee reflecting all key risks a bank is facing. This is not part of this 
chapter. 

To derive the key drivers of loan prepayment, we start with a simple example. 
We use a loan similar to the one already used for illustration in Chap. 17. It is a
15-year instalment loan with a fixed interest rate and an amortization rate of 4%. 
The loan’s collateral is worth 40% of the initial notional. The bank’s rating system 
is described by the transition matrix of Fig. 6.7 in Chap. 6 and discount factors are 
computed from a flat zero rate of 5% with annual compounding. The minimum 
interest rates that have to be charged by a bank for each rating grade using the 
framework of Chap. 17 are reported in Table 18.1. 

In the calculation of the minimum interest rates of Table 18.1 it was assumed, as
in the examples of Chap. 17, that economic capital is computed according to the
Basel II formulas for residential mortgage loans under the advanced IRB approach
(BCBS 2004). The cost margin is 1% and the hurdle rate 10%.¹ Note that risk costs
and opportunity costs of capital are exactly equal to the results in Table 17.2 of
Chap. 17. This is no surprise for the opportunity costs of capital because they are
computed in exactly the same way in both cases. Concerning the risk costs one
might have expected a small difference because the loan in Chap. 17 was a floating
interest rate loan while in this chapter we are using a fixed interest rate loan for
illustration. However, since we have used an interest rate curve with a flat zero rate
in both examples, all forward rates are equal to the base swap rate for the loan.
Therefore, the risk costs have to be exactly equal in both cases. Under a realistic
interest rate curve which contains some steepness and curvature small differences
in the risk costs of a floating rate versus a fixed rate loan will be observed because
the risk of losing an interest payment is valued differently depending on the
variability in the forward rates.


¹ See Chap. 17 for a precise definition of these quantities.


Table 18.2 Minimum interest rate (in %) for the outstanding instalment loan after 10 years


Rate shift (in %)   Rating grade
                    1          2          3          4          5          6          7           8

-3.00               2.91 (P)   2.95 (P)   2.97 (P)   3.14 (P)   3.72 (P)    4.84 (P)    7.48 (P)   10.28 (C)
-2.00               3.92 (P)   3.96 (P)   3.97 (P)   4.15 (P)   4.73 (P)    5.86 (P)    8.53 (C)   11.36 (C)
-1.00               4.93 (P)   4.97 (P)   4.98 (P)   5.16 (P)   5.75 (P)    6.88 (P)    9.58 (C)   12.44 (C)
 0.00               5.94 (P)   5.98 (P)   5.99 (P)   6.17 (P)   6.76 (P)    7.91 (C)   10.63 (C)   13.52 (C)
+1.00               6.95 (P)   6.99 (P)   7.01 (P)   7.18 (P)   7.78 (C)    8.94 (C)   11.68 (C)   14.61 (C)
+2.00               7.97 (C)   8.01 (C)   8.02 (C)   8.20 (C)   8.80 (C)    9.97 (C)   12.74 (C)   15.70 (C)
+3.00               8.99 (C)   9.03 (C)   9.04 (C)   9.22 (C)   9.83 (C)   11.00 (C)   13.80 (C)   16.79 (C)
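
The statement that the cost components of Table 18.1 coincide with those of Chap. 17 can be checked against the numbers: subtracting risk costs, opportunity costs and the 1% cost margin from the minimum rate should leave the same base rate for every rating grade. The sketch below performs this check and compares the result with the quarterly-compounded equivalent of the flat 5% zero rate. Identifying that equivalent with the base swap rate rests on our assumption that the loan pays interest quarterly, as the example loan of Chap. 17 does.

```python
# rating grade -> (risk costs, opportunity costs, minimum rate), in percent (Table 18.1)
table_18_1 = {
    1: (0.0193, 0.0469, 5.9751),
    2: (0.0594, 0.0888, 6.0571),
    3: (0.1137, 0.1023, 6.1250),
    4: (0.3330, 0.2864, 6.5283),
    5: (1.0146, 0.8258, 7.7493),
    6: (2.4618, 1.5762, 9.9469),
    7: (6.1548, 2.5831, 14.6469),
    8: (10.8426, 2.9259, 19.6773),
}
COST_MARGIN = 1.0  # in percent

# Base rate implied by each row: minimum rate minus all cost components
implied_base = {
    g: rate - (r + h_e + COST_MARGIN) for g, (r, h_e, rate) in table_18_1.items()
}

# Rate with quarterly compounding that is equivalent to a flat zero rate of 5%
# with annual compounding (quarterly payments are an assumption, see above)
zero_rate = 0.05
quarterly_equiv = 4.0 * ((1.0 + zero_rate) ** 0.25 - 1.0) * 100.0

for g, base in implied_base.items():
    print(f"grade {g}: implied base rate = {base:.4f}%")
print(f"quarterly-compounded equivalent of 5%: {quarterly_equiv:.4f}%")
```

The implied base rate is the same for all grades up to rounding in the fourth decimal, which is what one expects if only the base rate, and none of the cost components, distinguishes the fixed rate loan from the floating rate loan under a flat curve.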

We assume that this 15-year loan contains a prepayment option after 10 years. 
To get an impression of the risk factors driving prepayment, we compute the 
minimum interest rate for the loan again after 10 years assuming that the loan 
was sold to a debtor in rating grade 5 initially. During the 10 years interest rates and 
the rating of the debtor can change. Concerning interest rate changes, we assume 
that only parallel shifts are possible, i.e. that the discount curve after 10 years is still 
represented by a flat forward curve. The results under different combinations of 
scenarios are summarized in Table 18.2. 

Table 18.2 reports the minimum interest rates for the outstanding 5-year instalment
loan, computed using the framework of Chap. 17, under the different scenarios for
rating and interest rate changes. If the minimum interest rate is
below the initial minimum interest rate of the loan (7.7493%), the debtor will 
prepay his loan and refinance at the cheaper rate. The cases where the debtor 
will prepay are indicated by a “(P)”, the cases where he continues his loan are 
marked with a “(C)”. 

We see from the results that both the level of interest rates and the rating of a 
debtor at the prepayment date have an influence on the prepayment decision. If 
interest rates fall sharply but at the same time the debtor’s rating deteriorates, it
might still be reasonable to continue the loan despite the reduced interest rates. This
is the case, for instance, for an interest rate reduction of 300 basis points but a 
simultaneous downgrade of the debtor to rating grade 8. On the other hand if 
interest rates rise prepayment might still be reasonable if at the same time the 
debtor’s rating has improved. If interest rates rise by 100 basis points and the 
debtor’s rating improves by at least one grade, prepayment is advantageous. 

If nothing happens, i.e. interest rates and the debtor’s rating stay constant, the
debtor will prepay. The reason is that for the outstanding 5-year loan the average 
default probability applied in the pricing formula is much lower than for the initial 
15-year loan. This results in a lower minimum margin. 
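
The decision rule behind the “(P)” and “(C)” labels is a simple comparison, sketched below for a fully rational debtor. The refinancing rates are the values of Table 18.2 and the contract rate is the initial minimum rate for rating grade 5 from Table 18.1; the function name `decision` is introduced here for illustration.

```python
CONTRACT_RATE = 7.7493  # initial minimum rate of the loan in % (rating grade 5)

# rate shift (in %) -> minimum rates (in %) for rating grades 1..8 (Table 18.2)
refinancing_rates = {
    -3.0: [2.91, 2.95, 2.97, 3.14, 3.72, 4.84, 7.48, 10.28],
    -2.0: [3.92, 3.96, 3.97, 4.15, 4.73, 5.86, 8.53, 11.36],
    -1.0: [4.93, 4.97, 4.98, 5.16, 5.75, 6.88, 9.58, 12.44],
    0.0: [5.94, 5.98, 5.99, 6.17, 6.76, 7.91, 10.63, 13.52],
    1.0: [6.95, 6.99, 7.01, 7.18, 7.78, 8.94, 11.68, 14.61],
    2.0: [7.97, 8.01, 8.02, 8.20, 8.80, 9.97, 12.74, 15.70],
    3.0: [8.99, 9.03, 9.04, 9.22, 9.83, 11.00, 13.80, 16.79],
}

def decision(refinancing_rate, contract_rate=CONTRACT_RATE):
    """'P' (prepay) if refinancing is cheaper than the contract rate, else 'C'."""
    return "P" if refinancing_rate < contract_rate else "C"

grid = {
    shift: [decision(rate) for rate in rates]
    for shift, rates in refinancing_rates.items()
}

print("shift   " + " ".join(str(g) for g in range(1, 9)))
for shift, flags in grid.items():
    print(f"{shift:+5.2f}   " + " ".join(flags))
```

The boundary between prepayment and continuation runs diagonally through the grid, which is the pattern described above: a rating downgrade can offset falling rates and a rating upgrade can offset rising rates.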

Overall, we conclude from the simple example that both interest rate and rating 
changes affect the prepayment decision. Therefore, we have to extend the pricing 
framework of Chap. 17 by stochastic interest rate and rating changes to include 
prepayment rights into the framework. Furthermore, it is known from practice that
debtors do not act as rationally as, for instance, interest rate derivatives or bond traders.
They might not prepay even if it is advantageous for them. This behaviour of debtors
should also be included in a model.

In the next section, we will explain a pricing framework for loans with prepay- 
ment rights. We will explain the necessary mathematical tools on an intuitive level 
without going too much into details. Some comments on the theory behind the 
pricing framework in the light of derivatives pricing and credit risk modelling 
are made and applications of the framework for the risk management of loans 
with embedded options are outlined. In Sect. 18.3 the pricing algorithm will be 
illustrated with an example. In the final section a short conclusion with possible 
extensions of the framework for the risk management of loan portfolios is given. 


18.2 Pricing Model 


We will derive a pricing model for loans with embedded options in three steps. In 
the first step, the modelling of rating transitions will be explained which results in a 
rating tree. In the second step, a term structure model for the evolution of interest 
rates will be introduced and we will try to explain its basic features in an intuitive 
way that is also understandable for readers who are not familiar with interest rate 
derivatives pricing theory. This will result in a tree model that is used for pricing 
interest rate dependent products. In the final step, we will combine the rating tree 
and the interest rate tree to a three-dimensional tree that can be used for pricing 
loans with prepayment rights. 


18.2.1 Modelling Rating Transitions 


Most of the mathematics behind the modelling of rating transitions was already 
developed in Chap. 6. Here, we will apply the results developed in Chap. 6 only. 
Suppose we have a financial product that depends on the rating of a debtor at the
times 0 = T_0, T_1, ..., T_m. These could be the payment times of a loan or the dates
where prepayment of all or parts of the outstanding notional of a loan is possible. At
time zero the rating of a debtor is known. At times T_k, k ≠ 0, we can compute the
probabilities that a debtor will be in rating grade i under the assumption that he was
in rating grade j at time T_{k-1}. These probabilities can be computed from the
transition matrix corresponding to the time period T_k - T_{k-1}. This is illustrated with
the rating tree in Fig. 18.1.


Fig. 18.1 Rating tree for a rating system with five rating grades (time axis: T_0, T_1, T_2, T_3)

In Fig. 18.1, the notation p_{ij}(T_1|T_0) denotes the probability that a debtor migrates
from rating grade i in T_0 to rating grade j in T_1. In the rating tree it is assumed that
the debtor has an initial rating of 3. Building the tree from T_1 to T_2 is more
complicated because the rating of the debtor in T_1 is not known in T_0. Here we
have to specify the transition probabilities for all rating grades (except for the
default grade) separately. All these probabilities can be read from the transition
matrix P(T_2 - T_1) corresponding to the time period T_2 - T_1. The transition
probabilities for a migration from grade i in T_1 to grade j in T_2 can be read from the i-th row
of the matrix P(T_2 - T_1). For the calculation of this matrix we again refer to Chap. 6.
In this way it is possible to describe every possible rating path a debtor can take in 
the tree and associate a probability to each path, the product of the transition 
probabilities in each time step. 
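
The path probabilities of a rating tree can be sketched as follows. The transition matrix is hypothetical (two rating grades plus an absorbing default state) and is not the matrix of Fig. 6.7; using the same matrix for both time steps additionally assumes periods of equal length.

```python
# Hypothetical one-period transition matrix: grade 1, grade 2, default.
# Row i holds the probabilities of moving from state i to each state;
# the default state (last row) is absorbing.
P = [
    [0.90, 0.08, 0.02],
    [0.10, 0.75, 0.15],
    [0.00, 0.00, 1.00],
]
STATES = range(len(P))

def path_probability(path, matrices):
    """Probability of a rating path (s_0, ..., s_m): the product of the
    one-step transition probabilities; matrices[k] covers T_k -> T_{k+1}."""
    prob = 1.0
    for k in range(len(path) - 1):
        prob *= matrices[k][path[k]][path[k + 1]]
    return prob

def matmul(a, b):
    """Plain matrix product for square lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in STATES) for j in STATES] for i in STATES]

matrices = [P, P]  # same matrix for both steps (equal period lengths assumed)
start = 0          # debtor starts in grade 1 (index 0)

# Summing the path probabilities over the intermediate state reproduces
# the corresponding entry of the two-period transition matrix.
two_step = matmul(P, P)
for end in STATES:
    total = sum(path_probability((start, mid, end), matrices) for mid in STATES)
    print(f"end state {end}: sum over paths = {total:.6f}, "
          f"two-period matrix = {two_step[start][end]:.6f}")
```

The agreement between the two columns is the tree analogue of multiplying transition matrices: the tree keeps the individual paths, which is needed once cash flows or exercise decisions depend on the rating at intermediate dates.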

From a practical perspective, a rating tree is not an important tool if it is considered 
stand-alone. The reason is that there are hardly any financial products in the market that
depend solely on the rating of a debtor. Therefore, a rating tree will be almost always 
applied in combination with some other valuation framework. In the context of loan 
valuation this will be a short rate tree that will be introduced in the next section. 


18.2.2 Modelling Interest Rate Dynamics 


In this section we will give an introduction to short rate models, the simplest class of 
term structure models which is applied in banks. We start with a short overview of 
interest rate products that are liquidly traded and that are needed for calibrating a
term structure model. After that the Gaussian one-factor model is introduced and its 
mathematical properties are explained. We show how the parameters of the model 
are determined from market data and how tree algorithms for pricing interest rate 
dependent products are constructed in this model. 


18.2.2.1 Interest Rate Markets 


The most important interest rates in the interbank market are Libor rates and swap 
rates. A Libor rate determines the interest rate at which banks lend money to each 
other. These interest rates are available for maturities up to 12 months. A swap rate 
is the fixed interest rate in a swap, i.e. a contract where counterparties exchange a 
fixed rate of interest for a Libor rate. Swap contracts have maturities from 1 year up
to 50 years. These interest rates are quoted on a regular basis in market data systems
like Reuters or Bloomberg. As we have already explained in Chap. 17 these interest
rates are needed to compute the interbank discount curve that can be used as a
reference curve for loan pricing. 

On these two interest rates call and put options are traded. In the case of Libor 
rates these options are called caps and floors. Options on a swap rate are called 
swaptions. The market conventions for these products are a bit different from call 
and put options in equity markets. An example of a cap is illustrated in Fig. 18.2. 

Fig. 18.2 A 7Y cap on a 12M Libor rate with a strike price of 4% (time axis in years from 0 to 7; each of the six caplets is exercised if 12M Libor > 4% and pays, if exercised, at the following annual date)


Fig. 18.3 A 5Y receiver swaption on a 10Y swap with a strike price of 4% (exercise if swap rate < 4%; the swap payments follow if the swaption was exercised)


A cap is not a single option but a series of call options. In the example of a 7Y
cap on a 12M Libor rate of Fig. 18.2 the cap consists of six call options with a strike
price of 4%. Each of these options is called a caplet. Each caplet has an exercise
date and a payment date. At the exercise date the option holder can decide if he
wishes to exercise the option, which he will do if the option payoff is positive, i.e. if
the Libor rate of this day is greater than 4%. The date where the payment is done if
the option was exercised is 12 months later. The time gap between exercise and
payment corresponds exactly to the tenor of the interest rate. This is different from
equity options where payment is done immediately after exercising an option. In
general notation, the payoff of a caplet with maturity T is


$$\max(0, f - K) \cdot \tau \tag{18.1}$$


where $f$ is a Libor rate that is fixed in $T$, $K$ is the strike price, and $\tau$ is the tenor of the 
Libor rate. The payment time of the payoff, if it is positive, is $T + \tau$. 

An example of a swaption is given in Fig. 18.3. Here, a 5Y receiver swaption on 
a 10Y swap with a strike price of 4% is illustrated where we stick to the European 
convention of a swap paying the fixed rate annually and the floating rate semi-annually. 
In this contract the holder has the right to enter in 5 years into a receiver 
swap with a maturity of 10 years. The term “receiver swap” means that the 
option holder will receive fixed payments in this swap contract.² Therefore, he will 
exercise the option if the market swap rate at the exercise date is below the strike 
price of the option contract. 

The payoff of a receiver swaption cannot be expressed by a simple formula like 
the payoff of a caplet because the profit of exercising a receiver swaption is realized 
at every payment date of the fixed rate in the swap. Therefore, this payoff at the 
exercise date must be written as the present value of all these profits 


$$\sum_{i=1}^{n} \tau_i \cdot df(T_i) \cdot \max(0, K - s) \tag{18.2}$$


where $n$ is the number of fixed rate payments in the swap, $T_1, \ldots, T_n$ the payment 
times, $df(T)$ the discount factor corresponding to time $T$, $\tau_i$ the year fraction of the 
$i$-th fixed rate period in the underlying swap, $K$ the strike price, and $s$ the swap rate 
that is fixed at the exercise date of the swaption. 


² If the option holder pays the fixed rate in the swap the contract is called a payer swaption. 

For both option contracts, the option premium is determined by the Black 76 
formula, a market standard formula for call and put options in many markets (see 
Black 1976). It is assumed that at the exercise date the underlying interest rate, the 
Libor rate in the case of a cap and the swap rate in the case of a swaption, is 
distributed log-normally, i.e. that the logarithm of the interest rate is distributed 
normally. This distribution is given by 


$$\ln(y_T) \sim N\left(\ln(y_0) - 0.5 \cdot \sigma^2 T,\; \sigma \sqrt{T}\right)$$


where $y_T$ is the value of the interest rate at time $T$, $y_0$ is its current forward value, and 
$\sigma$ its volatility. Note that the standard deviation is proportional to the square root of 
$T$, i.e. the uncertainty of the future value of the interest rate $y$ is increasing with time. 
Under this assumption a simple formula can be derived for the option premium of 
both contracts by calculating the discounted expected value of each contract’s 
payoff. 

In the case of the caplet, we get for the premium $V_{caplet}$ the expression 


$$\begin{aligned}
V_{caplet} &= df(T + \tau) \cdot \tau \cdot \left(f \cdot N(d_1) - K \cdot N(d_2)\right), \\
d_1 &= \frac{\log(f/K) + 0.5 \cdot \sigma^2 \cdot T}{\sigma \cdot \sqrt{T}}, \\
d_2 &= d_1 - \sigma \cdot \sqrt{T},
\end{aligned} \tag{18.3}$$


where $f$ is the forward of the underlying Libor rate, $K$ the caplet’s strike price, $\tau$ the 
Libor’s tenor, $T$ the caplet’s expiry, $df(T + \tau)$ the discount factor corresponding to 
the caplet’s payment time, and $N(.)$ is the cumulative distribution function of the 
standard normal distribution. A similar formula exists for the floorlet. 

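For the premium of a receiver swaption $V_{receiver}$ we get 


$$\begin{aligned}
V_{receiver} &= M \cdot \left(K \cdot N(-d_2) - s \cdot N(-d_1)\right), \\
M &= \sum_{i=1}^{n} \tau_i \cdot df(T_i), \\
d_1 &= \frac{\log(s/K) + 0.5 \cdot \sigma^2 \cdot T}{\sigma \cdot \sqrt{T}}, \\
d_2 &= d_1 - \sigma \cdot \sqrt{T}.
\end{aligned} \tag{18.4}$$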

The notation used in this formula was already explained above. For payer 
swaptions an analogous formula exists. 
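
The formulas (18.3) and (18.4) translate directly into code. The following is a minimal sketch in Python; the function names and all numerical inputs are our own illustrative assumptions and are not market data.

```python
import math


def norm_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black76_caplet(fwd, strike, vol, expiry, tenor, df_pay):
    """Caplet premium as in (18.3); df_pay is the discount factor df(T + tau)."""
    std = vol * math.sqrt(expiry)
    d1 = (math.log(fwd / strike) + 0.5 * std * std) / std
    d2 = d1 - std
    return df_pay * tenor * (fwd * norm_cdf(d1) - strike * norm_cdf(d2))


def black76_receiver_swaption(swap_rate, strike, vol, expiry, year_fracs, dfs):
    """Receiver swaption premium as in (18.4); M is the annuity of the fixed leg."""
    annuity = sum(tau_i * df_i for tau_i, df_i in zip(year_fracs, dfs))
    std = vol * math.sqrt(expiry)
    d1 = (math.log(swap_rate / strike) + 0.5 * std * std) / std
    d2 = d1 - std
    return annuity * (strike * norm_cdf(-d2) - swap_rate * norm_cdf(-d1))


# Hypothetical inputs, chosen only for illustration
caplet = black76_caplet(fwd=0.04, strike=0.04, vol=0.20,
                        expiry=1.0, tenor=1.0, df_pay=0.95)
receiver = black76_receiver_swaption(
    swap_rate=0.04, strike=0.04, vol=0.20, expiry=5.0,
    year_fracs=[1.0] * 10, dfs=[0.80 - 0.03 * i for i in range(10)],
)
print(f"caplet premium:   {caplet:.6f}")
print(f"receiver premium: {receiver:.6f}")
```

For very small volatilities the receiver swaption premium approaches its intrinsic value $M \cdot \max(0, K - s)$, which is a convenient sanity check for an implementation.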

In practice, these options are traded liquidly and prices are determined by supply 
and demand. The formulas (18.3) and (18.4) are used as quotation tools. For each 
option the volatilities are quoted at which traders are willing to buy or sell an option 
and the formulas are needed to convert volatilities into option prices. The reason is 
that for traders it is easier to compare option prices for different maturities and 
different strike prices in terms of implied volatilities. The volatilities that reflect 
current market prices are quoted in market data systems like Reuters or Bloomberg. 
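
The conversion between quoted volatilities and premiums works in both directions. As a sketch, the caplet formula (18.3) can be inverted numerically to recover the implied volatility from a premium. The bisection routine, its bounds and the inputs below are our own assumptions and not a market convention.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def caplet_price(vol, fwd, strike, expiry, tenor, df_pay):
    """Black 76 caplet premium (18.3) as a function of the quoted volatility."""
    std = vol * math.sqrt(expiry)
    d1 = (math.log(fwd / strike) + 0.5 * std * std) / std
    return df_pay * tenor * (fwd * norm_cdf(d1) - strike * norm_cdf(d1 - std))


def implied_vol(price, fwd, strike, expiry, tenor, df_pay,
                lo=1e-6, hi=5.0, tol=1e-10):
    """Invert (18.3) by bisection; the premium is increasing in the volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if caplet_price(mid, fwd, strike, expiry, tenor, df_pay) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip with hypothetical inputs: volatility -> premium -> volatility
quoted_vol = 0.25
premium = caplet_price(quoted_vol, fwd=0.035, strike=0.04,
                       expiry=2.0, tenor=0.5, df_pay=0.92)
recovered_vol = implied_vol(premium, fwd=0.035, strike=0.04,
                            expiry=2.0, tenor=0.5, df_pay=0.92)
print(f"premium: {premium:.8f}, recovered volatility: {recovered_vol:.6f}")
```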


18.2.2.2 The G1 Model 


The simplest model class for modelling the term structure of interest rates are short 
rate models. The short rate is an artificial mathematical quantity that cannot be 
observed in the market. It describes the interest rate that is valid over the next 
infinitesimal time period. Illustratively, one can think of the short rate as an 
approximation of the overnight rate. Therefore, short rate models describe the 
dynamics and the future development of the overnight rate. 

If the distribution of the short rate at future times is normal, the corresponding 
short rate model is called Gaussian. In its simplest version the short rate is driven by 
one stochastic factor and is called a Gaussian one-factor short rate model, the G1 
model. Mathematically, the model is described by the dynamics 


$$\begin{aligned}
r(t) &= x(t) + \theta(t), \\
dx &= -\kappa \cdot x(t) \cdot dt + \sigma(t) \cdot dW, \\
x(0) &= 0,
\end{aligned} \tag{18.5}$$


where $r$ is the short rate, $x$ the stochastic factor, $\theta$ a deterministic time-dependent 
function, $\kappa$ a positive constant, $\sigma$ the (possibly time-dependent) volatility of $x$, and 
$W$ a Wiener process.³ 

For readers who are not familiar with continuous-time stochastic calculus we 
illustrate the short rate dynamics (18.5) by its discretized version. Starting from 
$x = 0$ at time $t = 0$ a path of the short rate can be simulated using uniform time 
steps $\Delta t$ by 


$$\begin{aligned}
r(i\Delta t) &= x(i\Delta t) + \theta(i\Delta t), \\
x(i\Delta t) &= x((i-1)\Delta t) - \kappa \cdot x((i-1)\Delta t) \cdot \Delta t + \sigma((i-1)\Delta t) \cdot \sqrt{\Delta t} \cdot Z,
\end{aligned} \tag{18.6}$$

where $Z$ is a random number that follows a standard normal distribution. The short rate $r$ at each 
time point is written as the sum of the stochastic factor $x$ and the function $\theta$. The 
stochastic factor is driven by two components. The first component is deterministic. 
The second component is stochastic and models the randomness of $x$. 


³ The G1 model is a mathematical transformation of the Hull and White (1990) model. The 
transformed model is more convenient from a mathematical perspective but contains exactly the 
same economic content. 
To simulate a path of the short rate one has to simulate normally distributed 
random numbers and apply (18.6) iteratively. The parameter $\sigma$ is a measure for 
the uncertainty of future values of the short rate. The larger $\sigma$, the more the third 
term in (18.6) can fluctuate around zero. Economically one would expect 
overnight rates to be limited to a certain range of numbers. This is in 
contrast to a stock where unlimited growth over time is in principle possible. If 
the overnight rate has a rather high value one would expect it to fall with high 
probability in the future. The opposite is true if the overnight rate is at a historical 
low. This property of interest rates is called mean reversion. In (18.6) the 
mean reversion property is ensured by the second term involving $x$. Whenever $x$ is 
positive (negative) the second term becomes negative (positive) and generates a 
downward (upward) drift. 
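
The recursion (18.6) can be sketched in a few lines of Python. The function name, the flat choice of $\theta$ and all parameter values are illustrative assumptions; $\theta$ and $\sigma$ are passed as functions of time.

```python
import math
import random


def simulate_short_rate_path(theta, sigma, kappa, dt, n_steps, rng):
    """Apply the recursion (18.6); theta and sigma are functions of time."""
    x = 0.0                                  # x(0) = 0
    path = [x + theta(0.0)]                  # r(0) = x(0) + theta(0)
    for i in range(1, n_steps + 1):
        t_prev = (i - 1) * dt
        z = rng.gauss(0.0, 1.0)              # standard normal random number Z
        # deterministic mean reversion term plus stochastic term
        x = x - kappa * x * dt + sigma(t_prev) * math.sqrt(dt) * z
        path.append(x + theta(i * dt))
    return path


rng = random.Random(42)                      # fixed seed for reproducibility
path = simulate_short_rate_path(
    theta=lambda t: 0.03,                    # hypothetical flat theta
    sigma=lambda t: 0.01,                    # hypothetical constant volatility
    kappa=0.1, dt=1.0 / 252.0, n_steps=252, rng=rng,
)
print(f"short rate after one year on this path: {path[-1]:.4%}")
```

Setting the volatility to zero switches off the third term, so that the simulated short rate simply follows $\theta$.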

To price interest rate products, one has to be able to simulate future Libor or 
swap rates. Both interest rates are computed from the future discount curve. 
Therefore, it suffices to simulate future discount curves. This can be done in a 
short rate model by simulating overnight rates and multiplying the resulting 
overnight discount factors to get a discount factor corresponding to a larger 
time interval for a specific scenario. Taking expectations over a large number of 
scenarios results in a future discount curve. To be mathematically more precise, a 
future discount curve at time $t$ can be computed conditional on the short rate $r(t)$ 
at time $t$ from 


$$df(T \mid r(t) = r) = E\left[\exp\left(-\int_t^T r(s)\,ds\right) \,\middle|\, r(t) = r\right] \tag{18.7}$$


To illustrate the basic principle of product pricing using a short rate model we 
take the example of a caplet. To price a caplet in the short rate model (18.5) the 
following steps have to be carried out: 


1. Simulate a short rate path from time $t = 0$ to time $T$ using (18.6). This path ends 
at time $T$ in the value $r(T)$. 

2. Simulate many short rate paths from time $T$ to time $T + \tau$ using (18.6). All paths 
start in $r(T)$. 

3. From all the short rate paths in step 2 compute the discount curve using (18.7) 
where the expectation is replaced by an arithmetic average. 

4. From the discount curve in step 3 compute the realized Libor rate $f$ 

$$f = \left(1/df(T + \tau \mid r(T) = r) - 1\right)/\tau$$

5. Compute the discounted payoff for this scenario by 

$$df(T + \tau \mid r(T) = r) \cdot \tau \cdot \max(0, f - K) \tag{18.8}$$

6. Compute the discount factor corresponding to the path in step 1 and multiply it 
with the result in (18.8) to discount the payoff back to time $t = 0$. 

7. Repeat steps 1-6 many times and compute the option price as the average over 
all simulated values. 


This procedure looks very awkward because nested Monte-Carlo simulations 
have to be applied to compute the price of a rather simple instrument. The above 
procedure would never be implemented in practice. It should just serve as an 
illustration how a short rate model in principle can be used to determine the price 
of a financial instrument. 
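
For completeness, the seven steps can be written down as a nested simulation. The sketch below is for illustration only and rests on our own simplifying assumptions: a constant volatility, a rectangle rule for the time integral of the short rate, and hypothetical parameter values.

```python
import math
import random


def euler_step(x, kappa, sigma, dt, rng):
    """One step of the recursion (18.6) for the stochastic factor x."""
    return x - kappa * x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)


def nested_mc_caplet(theta, kappa, sigma, expiry, tenor, strike,
                     dt, n_outer, n_inner, seed=0):
    """Steps 1-7 of the text with constant sigma and a rectangle rule."""
    rng = random.Random(seed)
    n_expiry = round(expiry / dt)
    n_tenor = round(tenor / dt)
    total = 0.0
    for _ in range(n_outer):
        # Step 1: one path from 0 to T, accumulating the integral of r
        x, integral = 0.0, 0.0
        for i in range(n_expiry):
            integral += (x + theta(i * dt)) * dt
            x = euler_step(x, kappa, sigma, dt, rng)
        df_0_to_expiry = math.exp(-integral)
        # Steps 2 and 3: inner paths from T to T + tau, all starting in x(T)
        acc = 0.0
        for _ in range(n_inner):
            y, inner = x, 0.0
            for i in range(n_tenor):
                inner += (y + theta(expiry + i * dt)) * dt
                y = euler_step(y, kappa, sigma, dt, rng)
            acc += math.exp(-inner)
        df_cond = acc / n_inner
        # Steps 4 and 5: realized Libor rate and payoff discounted to T, cf. (18.8)
        libor = (1.0 / df_cond - 1.0) / tenor
        payoff_at_expiry = df_cond * tenor * max(0.0, libor - strike)
        # Step 6: discount back to time 0 along the outer path
        total += df_0_to_expiry * payoff_at_expiry
    # Step 7: average over all outer scenarios
    return total / n_outer


price = nested_mc_caplet(theta=lambda t: 0.03, kappa=0.1, sigma=0.01,
                         expiry=1.0, tenor=1.0, strike=0.03,
                         dt=0.05, n_outer=200, n_inner=200, seed=1)
print(f"nested Monte-Carlo caplet price: {price:.6f}")
```

The computational effort grows with the product of the numbers of outer and inner paths, which shows why this approach is not used in practice.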

One of the reasons for the popularity of the Gaussian one-factor model is its 
analytical tractability. For instance, it is not necessary to compute (18.7) by Monte- 
Carlo simulation because an analytical expression exists for this expectation. In the 
next two subsections we will explain the missing parts for using this model in 
practice, how to calibrate the model parameters and how to implement an efficient 
pricing algorithm after the model is calibrated. 

We have explained the short rate model on a rather intuitive level. For instance, 
we have computed product prices as expectations over simulated scenarios. 
Although this procedure seems plausible it is not clear that this leads to reasonable 
prices. In derivatives pricing theory it is shown that the absence of arbitrage in 
financial markets, i.e. the absence of trading possibilities that deliver risk free 
profits without capital investments, implies the existence of a probability measure, 
the risk-neutral measure, under which meaningful prices can indeed be computed as 
expectations. For details the reader is referred to the books of Hull (2008), Joshi 
(2003) and Shreve (2004). 

Finally, we comment on the economic interpretation of the model. At first 
glance it may seem strange to model the dynamics of the full term structure by the 
dynamics of the overnight rate. From empirical analyses of term structure movements 
over time it is known that the term structure dynamics can be described with 
very good accuracy by three components: parallel shifts, changes in the steepness, 
and changes in the curvature of the term structure (Litterman and Scheinkman 
1991). The most important component is the parallel movement. Basically, a one-factor 
model describes the changes in the level of interest rates. Whether these level 
changes are modelled by a short-term, medium-term, or long-term interest rate 
does not play a role. In this sense the modelling of the term structure by a very short 
term rate can be justified. Of course, to model more general movements of the term 
structure more factors are needed. A good reference for modern interest rate 
modelling approaches is Brigo and Mercurio (2006). 


18.2.2.3 Calibration of the G1 Model 


In the last subsection we have explained the G1 model and outlined its economic 
interpretation. To use the model its parameters $\theta(t)$, $\kappa$, and $\sigma(t)$ have to be specified 
from market data. If we apply (18.7) with $t = 0$ and $r(0)$ approximately equal to the 
current overnight rate, the left hand side of (18.7) is equal to the current discount 
curve. This results in a condition for the parameter $\theta(t)$. In fact, it is possible to derive 
a formula for $\theta(t)$ that expresses $\theta(t)$ in terms of the current discount curve and the 
(still undetermined) parameters $\kappa$ and $\sigma(t)$ 


$$\begin{aligned}
\theta(t) &= f(t) + e^{-\kappa t} \int_0^t e^{\kappa s} \cdot \sigma^2(s) \cdot \frac{1 - e^{-\kappa (t - s)}}{\kappa}\,ds, \\
f(t) &= -\frac{\partial \log df(t)}{\partial t}.
\end{aligned} \tag{18.9}$$


The function $f(t)$ is the instantaneous forward rate, i.e. the forward rate that is 
valid from time $t$ over an infinitesimal time period. It can be computed from the 
current discount curve as shown in (18.9). This choice of $\theta(t)$ ensures that zero bond 
prices are correct in the G1 model and equal to current discount factors. 
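
As a sketch of how $\theta(t)$ can be evaluated numerically, the code below computes the instantaneous forward rate by a finite difference and adds the volatility-dependent correction. It assumes that for a constant volatility the correction reduces to $\sigma^2/(2\kappa^2) \cdot (1 - e^{-\kappa t})^2$ and that for a time-dependent volatility the integrand is $\sigma^2(s) \cdot e^{-\kappa(t-s)} \cdot (1 - e^{-\kappa(t-s)})/\kappa$. The function names, the midpoint rule and the flat discount curve are our own illustrative choices.

```python
import math


def instantaneous_forward(df, t, h=1e-4):
    """f(t) = -d/dt log df(t), approximated by a finite difference."""
    lo = max(t - h, 0.0)
    return -(math.log(df(t + h)) - math.log(df(lo))) / (t + h - lo)


def theta_constant_vol(df, t, kappa, sigma):
    """theta(t) for constant sigma: forward rate plus a convexity term."""
    convexity = sigma ** 2 / (2.0 * kappa ** 2) * (1.0 - math.exp(-kappa * t)) ** 2
    return instantaneous_forward(df, t) + convexity


def theta_general(df, t, kappa, sigma, n=2000):
    """theta(t) for a time-dependent sigma(s), midpoint rule for the integral."""
    if t == 0.0:
        return instantaneous_forward(df, t)
    ds = t / n
    integral = 0.0
    for j in range(n):
        s = (j + 0.5) * ds
        decay = math.exp(-kappa * (t - s))
        integral += sigma(s) ** 2 * decay * (1.0 - decay) / kappa * ds
    return instantaneous_forward(df, t) + integral


def flat_curve(t):
    """Hypothetical discount curve with a flat rate of 3%."""
    return math.exp(-0.03 * t)


theta_closed = theta_constant_vol(flat_curve, 5.0, kappa=0.1, sigma=0.01)
theta_numeric = theta_general(flat_curve, 5.0, kappa=0.1, sigma=lambda s: 0.01)
print(f"theta(5) closed form: {theta_closed:.6f}, numerical: {theta_numeric:.6f}")
```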

It remains to determine the model parameters $\kappa$ and $\sigma$. These parameters are 
calibrated from the prices of liquid options, i.e. caps or swaptions. The basic idea is 
to use the model parameters that result in model prices matching given market 
prices as closely as possible. This leads to an optimization problem 


$$\min_{\kappa, \sigma} \sum_{i=1}^{K} \left(V_{i,model} - V_{i,market}\right)^2 \tag{18.10}$$


where $K$ is the number of market instruments that are used for calibration, $V_{i,model}$ is 
the model price of the $i$-th instrument while $V_{i,market}$ is its market price. 

To carry out the calibration efficiently we need pricing formulas for caplets and 
swaptions in the G1 model. To derive the necessary pricing formulas we will show 
as a first step that in the G1 model both the pricing of caps and the pricing of 
swaptions can be reduced to the pricing of options on zero bonds. 

We start with a caplet with maturity $T$ and payment time $T + \tau$. The underlying 
of the caplet is a Libor rate $f(T, \tau)$ that is fixed in time $T$ and has a tenor $\tau$. By $P(t,T)$ 
we denote the price of a zero bond with maturity $T$ at time $t$. Note that at time $t = 0$ 
it holds $df(T) = P(0,T)$. For the caplet we write its price as the expected value of its 
discounted payoff. The expectation is taken over possible paths of the short rate 
which are determined by its dynamics. 


$$\begin{aligned}
V_{caplet} &= E\left[\exp\left(-\int_0^{T+\tau} r(s)\,ds\right) \cdot \tau \cdot (f(T,\tau) - K)^+\right] \\
&= E\left[\exp\left(-\int_0^{T} r(s)\,ds\right) \cdot P(T,T+\tau) \cdot \tau \cdot (f(T,\tau) - K)^+\right] \\
&= E\left[\exp\left(-\int_0^{T} r(s)\,ds\right) \cdot P(T,T+\tau) \cdot \left(\frac{1}{P(T,T+\tau)} - 1 - K \cdot \tau\right)^+\right] \\
&= E\left[\exp\left(-\int_0^{T} r(s)\,ds\right) \cdot \left(1 - (1 + K \cdot \tau) \cdot P(T,T+\tau)\right)^+\right] \\
&= (1 + K \cdot \tau) \cdot \underbrace{E\left[\exp\left(-\int_0^{T} r(s)\,ds\right) \cdot \left(\frac{1}{1 + K \cdot \tau} - P(T,T+\tau)\right)^+\right]}_{\text{put on zero bond}}
\end{aligned}$$


The notation $(.)^+$ is an abbreviation for $\max(0, .)$. By expressing the Libor rate 
$f(T,\tau) = (1/P(T,T+\tau) - 1)/\tau$ by the corresponding zero bond price we end up with a formula for a put 
option on a zero bond with strike price $1/(1 + K \cdot \tau)$. If we know a formula for the price of a put option on a zero 
bond in the G1 model we can use the above relation to compute the price of a caplet. 
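
The algebra behind this reduction can be checked numerically at the level of payoffs. The short sketch below compares the caplet payoff discounted to time $T$ with the scaled payoff of the put on the zero bond for a few hypothetical bond prices; the function names and numbers are illustrative assumptions.

```python
def caplet_payoff_at_expiry(p, strike, tenor):
    """Caplet payoff discounted to T: P * tau * (f - K)^+ with f = (1/P - 1)/tau."""
    libor = (1.0 / p - 1.0) / tenor
    return p * tenor * max(0.0, libor - strike)


def scaled_bond_put_payoff(p, strike, tenor):
    """(1 + K*tau) * (1/(1 + K*tau) - P)^+, the put on the zero bond."""
    scale = 1.0 + strike * tenor
    return scale * max(0.0, 1.0 / scale - p)


bond_prices = [0.90, 0.94, 0.96, 0.98, 1.00]   # hypothetical values of P(T, T + tau)
pairs = [(caplet_payoff_at_expiry(p, 0.04, 1.0), scaled_bond_put_payoff(p, 0.04, 1.0))
         for p in bond_prices]
for p, (caplet_side, put_side) in zip(bond_prices, pairs):
    print(f"P = {p:.2f}: caplet {caplet_side:.6f}, scaled put {put_side:.6f}")
```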

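Deriving a similar relationship for swaptions is more complicated. We start with 
the payoff of a swaption with maturity $T_0$. We replace the discount factors in the 
payoff formula (18.2) by zero bond prices $P(T_0,T_i)$ because we have to consider the 
value of the payoff at maturity $T_0$ to compute the swaption’s price. 


$$\begin{aligned}
\text{payoff} &= \max(0, K - s) \cdot \sum_{i=1}^{n} \tau_i \cdot P(T_0, T_i) \\
&= \max\left(0, K \cdot \sum_{i=1}^{n} \tau_i \cdot P(T_0, T_i) - s \cdot \sum_{i=1}^{n} \tau_i \cdot P(T_0, T_i)\right) \\
&= \max\left(0, \sum_{i=1}^{n} \tau_i \cdot K \cdot P(T_0, T_i) - (1 - P(T_0, T_n))\right) \\
&= \max\left(0, \sum_{i=1}^{n} c_i \cdot P(T_0, T_i) - 1\right).
\end{aligned} \tag{18.11}$$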

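In the derivation we have used the formula for the forward swap rate 


$$s = \frac{1 - P(T_0, T_n)}{\sum_{i=1}^{n} \tau_i \cdot P(T_0, T_i)}$$


and have introduced the coupons $c_i$ 


$$c_i = \begin{cases} K \cdot \tau_i, & \text{if } i \le n - 1 \\ K \cdot \tau_n + 1, & \text{if } i = n \end{cases}$$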


We have shown that a swaption is equivalent to an option on a coupon bond with 
a strike price of 1. This option will be exercised if the condition 

$$\sum_{i=1}^{n} c_i \cdot P(T_0, T_i) > 1$$

is fulfilled. 
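To proceed with deriving a pricing formula for the swaption we need a pricing 
formula for zero bonds. They can be computed from expression (18.7) 


$$P(t, T \mid r(t) = r) = A(t,T) \cdot \exp(-r \cdot B(t,T)) \tag{18.12}$$


The functions $A(t,T)$ and $B(t,T)$ can be expressed in terms of the model parameters 


$$\begin{aligned}
A(t,T) &= \frac{df(T)}{df(t)} \cdot \exp\left(B(t,T) \cdot f(t) - \frac{1}{2} \cdot B(t,T)^2 \cdot e^{-2\kappa t} \int_0^t e^{2\kappa s} \sigma^2(s)\,ds\right), \\
B(t,T) &= \frac{1}{\kappa} \cdot \left(1 - e^{-\kappa (T - t)}\right).
\end{aligned}$$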

We see that the zero bond price in (18.12) is monotonic in the short rate $r$. 
Therefore it is possible to find a unique value $r^*$ that fulfils the condition 


$$\sum_{i=1}^{n} c_i \cdot P(T_0, T_i \mid r(T_0) = r^*) = 1. \tag{18.13}$$


The value $r^*$ of the short rate has to be determined by a zero-search algorithm 
like Newton’s method or a bisection algorithm (Press et al. 1992). 
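Combining relation (18.13) with (18.11) leads to 


$$\begin{aligned}
\text{payoff} &= \max\left(0, \sum_{i=1}^{n} c_i \cdot P(T_0, T_i) - 1\right) \\
&= \max\left(0, \sum_{i=1}^{n} c_i \cdot P(T_0, T_i) - \sum_{i=1}^{n} c_i \cdot P(T_0, T_i \mid r(T_0) = r^*)\right) \\
&= \sum_{i=1}^{n} c_i \cdot \max\left(0, P(T_0, T_i) - P(T_0, T_i \mid r(T_0) = r^*)\right).
\end{aligned}$$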


We see that similar to the price of a caplet also the price of a swaption can be 
written in terms of prices of options on zero bonds because we have rewritten the 
payoff of a swaption as the payoff of a portfolio of call options on a zero bond with 
strike prices $P(T_0, T_i \mid r(T_0) = r^*)$. If we know a pricing formula for options on zero 
bonds in the G1 model, the prices of caplets and swaptions can be calculated easily. 
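
The decomposition can be illustrated numerically. The sketch below evaluates the zero bond formula (18.12) under our own simplifying assumptions of a constant volatility and a flat discount curve, solves (18.13) for $r^*$ by bisection, and compares the swaption payoff with the payoff of the portfolio of zero bond calls. All parameter values are hypothetical.

```python
import math

KAPPA, SIGMA, R0 = 0.1, 0.01, 0.03     # hypothetical parameters, flat curve at 3%


def df(t):
    return math.exp(-R0 * t)


def zero_bond(t, T, r):
    """P(t, T | r(t) = r) from (18.12) for constant sigma and f(t) = R0."""
    b = (1.0 - math.exp(-KAPPA * (T - t))) / KAPPA
    var_x = SIGMA ** 2 * (1.0 - math.exp(-2.0 * KAPPA * t)) / (2.0 * KAPPA)
    a = df(T) / df(t) * math.exp(b * R0 - 0.5 * b * b * var_x)
    return a * math.exp(-r * b)


def coupon_bond(t0, times, coupons, r):
    return sum(c * zero_bond(t0, t_i, r) for c, t_i in zip(coupons, times))


def critical_rate(t0, times, coupons, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve (18.13) by bisection; the coupon bond value is decreasing in r."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coupon_bond(t0, times, coupons, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


t0, strike = 5.0, 0.04
times = [6.0, 7.0, 8.0, 9.0, 10.0]
coupons = [strike] * 4 + [strike + 1.0]          # c_i with tau_i = 1
r_star = critical_rate(t0, times, coupons)
strikes = [zero_bond(t0, t_i, r_star) for t_i in times]


def swaption_payoff(r):
    return max(0.0, coupon_bond(t0, times, coupons, r) - 1.0)


def portfolio_payoff(r):
    return sum(c * max(0.0, zero_bond(t0, t_i, r) - k)
               for c, t_i, k in zip(coupons, times, strikes))


for r in (0.00, 0.02, 0.06):
    print(f"r = {r:.2f}: swaption {swaption_payoff(r):.6f}, "
          f"portfolio {portfolio_payoff(r):.6f}")
```

The two payoffs agree because all zero bond prices move in the same direction when $r$ changes, so either all calls in the portfolio are exercised or none is.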

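The price of an option on a zero bond can be computed as an expectation over the 
payoff using the formula for a zero bond price (18.12) in connection with the 
distribution of the short rate $r$ at the option’s maturity. This leads to an easy to 
evaluate formula for the price $V_{call}$ of a call option with maturity $T$ and strike price $K$ on a zero bond 
with maturity $T + \tau$ 


$$\begin{aligned}
V_{call} &= df(T + \tau) \cdot N(h) - K \cdot df(T) \cdot N(h - \sigma_p), \\
h &= \frac{1}{\sigma_p} \cdot \log\left(\frac{df(T + \tau)}{K \cdot df(T)}\right) + \frac{\sigma_p}{2},
\end{aligned} \tag{18.14}$$


where $\sigma_p$ is the standard deviation of the logarithm of the zero bond price $P(T, T + \tau)$, 
which in the G1 model is given by $\sigma_p^2 = B(T, T + \tau)^2 \cdot e^{-2\kappa T} \int_0^T e^{2\kappa s} \sigma^2(s)\,ds$. 

The price of a put option on a zero bond can be computed from the put-call parity 

$$V_{put} + df(T + \tau) = V_{call} + K \cdot df(T)$$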

This gives us the final ingredient to carry out the calibration of the G1 model. 
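
To make the calibration (18.10) concrete, the following sketch prices caplets in the G1 model via (18.14) and the put-call parity and then recovers $\kappa$ and $\sigma$ from synthetic prices by a simple grid search. It assumes a constant volatility, for which $\sigma_p = B(T, T+\tau) \cdot \sigma \cdot \sqrt{(1 - e^{-2\kappa T})/(2\kappa)}$, and a flat discount curve. The grid search stands in for a proper optimizer, and all names and numbers are hypothetical.

```python
import math

R0 = 0.03                                        # hypothetical flat curve at 3%


def df(t):
    return math.exp(-R0 * t)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bond_vol(kappa, sigma, expiry, tenor):
    """Standard deviation of log P(T, T + tau) for constant sigma."""
    b = (1.0 - math.exp(-kappa * tenor)) / kappa
    return b * sigma * math.sqrt((1.0 - math.exp(-2.0 * kappa * expiry)) / (2.0 * kappa))


def zero_bond_call(kappa, sigma, expiry, tenor, strike):
    """Call on a zero bond as in (18.14)."""
    s_p = bond_vol(kappa, sigma, expiry, tenor)
    h = math.log(df(expiry + tenor) / (strike * df(expiry))) / s_p + 0.5 * s_p
    return df(expiry + tenor) * norm_cdf(h) - strike * df(expiry) * norm_cdf(h - s_p)


def zero_bond_put(kappa, sigma, expiry, tenor, strike):
    """Put on a zero bond from the put-call parity."""
    call = zero_bond_call(kappa, sigma, expiry, tenor, strike)
    return call + strike * df(expiry) - df(expiry + tenor)


def g1_caplet(kappa, sigma, expiry, tenor, cap_rate):
    """Caplet as (1 + K*tau) puts on a zero bond with strike 1/(1 + K*tau)."""
    scale = 1.0 + cap_rate * tenor
    return scale * zero_bond_put(kappa, sigma, expiry, tenor, 1.0 / scale)


def objective(kappa, sigma, instruments, market_prices):
    """Sum of squared pricing errors as in (18.10)."""
    return sum((g1_caplet(kappa, sigma, *inst) - v) ** 2
               for inst, v in zip(instruments, market_prices))


# Synthetic "market" prices generated from known parameters (illustration only)
instruments = [(t, 1.0, 0.03) for t in (1.0, 2.0, 3.0, 5.0, 7.0)]  # (expiry, tenor, K)
market = [g1_caplet(0.10, 0.010, *inst) for inst in instruments]

grid = [(k / 100.0, s / 1000.0) for k in range(2, 31, 2) for s in range(4, 17)]
best_kappa, best_sigma = min(
    grid, key=lambda p: objective(p[0], p[1], instruments, market))
print(f"calibrated kappa = {best_kappa:.2f}, sigma = {best_sigma:.3f}")
```

In practice the market prices would be obtained from quoted volatilities via (18.3) or (18.4), and the grid search would be replaced by a numerical optimization routine.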

It remains to decide which instruments (caps or swaptions) should be used for 
calibration. It depends on the product for which the model is needed. As a general 
principle, one should use instruments for calibration that are similar to the product 
that should be priced. If a loan that contains prepayment rights should be priced, 
swaptions are more appropriate for calibration because a prepayment right is 
basically an embedded swaption. For a loan with embedded caps and floors on 
a floating interest rate, interest rate caps and floors are more suitable for the 
calibration of the G1 model. 


18.2.2.4 Tree Implementation of the G1 Model 


In this subsection we present an efficient implementation of the G1 model by a 
trinomial tree. A trinomial tree is a discrete method to price products in the G1 
model. In the Monte-Carlo simulation we have used to illustrate the G1 model in 
Sect. 18.2.2.2 we have already discretized time, but the short rate could still 
attain any real value at each point in time. In a trinomial tree the set of admissible 
values for the short rate is restricted to a finite grid of points in each time step. This 
is done in a structured way to construct an efficient algorithm for the calculation 
of product prices. 

An example of a trinomial tree is shown in Fig. 18.4. We denote each node with 
$r_{i,j}$ which is the $j$-th short rate grid point at the $i$-th time grid point. Every node $r_{i,j}$ in 
the tree has exactly three succeeding nodes $r_{i+1,j}$, $r_{i+1,j+1}$, and $r_{i+1,j+2}$. These nodes 
are built in a way that the tree is recombining. This ensures that the number of nodes 
does not grow exponentially with the number of time steps. Furthermore, associated 
with every node are three probabilities $q_{d,i,j}$, $q_{m,i,j}$, and $q_{u,i,j}$ that are needed to 
compute product prices as discounted expectations. 


Fig. 18.4 Illustration of a trinomial tree for the G1 model 


Both the values $r_{i,j}$ of the nodes and the probabilities $q_{d,i,j}$, $q_{m,i,j}$, and $q_{u,i,j}$ are 
determined by a construction process. The three probabilities are computed in a 
way that ensures that the local expected values and variances of the stochastic factor 
$x$ are identical to the corresponding values of the continuous-time process (18.5). 
The values of the short rate are chosen to ensure that zero bonds with maturities 
identical to the time grid points of the trinomial tree are priced correctly. 

To be more specific, to construct the tree we have to start with defining a time 
grid $0 = t_0, t_1, \ldots, t_l$. It has to be ensured that all dates that are relevant for product 
pricing like coupon payments and exercise dates are included in this grid. Further, 
we denote with $x_{i,j}$ the $j$-th point of the factor grid at time $t_i$. From the dynamics of 
the stochastic factor $x$ in (18.5) we get 


$$\begin{aligned}
m_{i,j} &= E\left[x(t_{i+1}) \mid x(t_i) = x_{i,j}\right] = x_{i,j} \cdot \exp(-\kappa \cdot (t_{i+1} - t_i)), \\
v_{i,j}^2 &= \mathrm{Var}\left[x(t_{i+1}) \mid x(t_i) = x_{i,j}\right] \approx \frac{\sigma^2(t_i)}{2\kappa} \cdot \left(1 - \exp(-2 \cdot \kappa \cdot (t_{i+1} - t_i))\right),
\end{aligned} \tag{18.15}$$


where for the calculation of the variance it was assumed that the volatility $\sigma$ is 
locally constant. 

The standard deviation $v_{i,j}$ of (18.15) is used to construct the grid for $x$. Intuitively, 
it is clear that the step size $\Delta x$ of the grid should be proportional to the 
standard deviation. A standard choice is 


$$\Delta x(t_{i+1}) = \max_j v_{i,j} \cdot \sqrt{3}. \tag{18.16}$$


With this choice for the step size the $x$-grid is constructed as $k \cdot \Delta x$. The values for 
$k$ that are needed to construct the grid at time $t_{i+1}$ are defined from the mean values 
in (18.15) to ensure that the grid covers the value range that is attained with high 
probability 

$$k = \mathrm{round}\left(\frac{m_{i,j}}{\Delta x(t_{i+1})}\right). \tag{18.17}$$


With this definition of the middle node of the three succeeding nodes at time $t_{i+1}$ 
of each node at time $t_i$, we have all ingredients to compute the tree probabilities. 
This is done by matching the moments of the continuous and the discrete distributions 
of the short rate by solving the set of equations 


$$\begin{aligned}
m_{i,j} &= q_{u,i,j} \cdot x_{i+1,k+1} + q_{m,i,j} \cdot x_{i+1,k} + q_{d,i,j} \cdot x_{i+1,k-1}, \\
v_{i,j}^2 &= q_{u,i,j} \cdot x_{i+1,k+1}^2 + q_{m,i,j} \cdot x_{i+1,k}^2 + q_{d,i,j} \cdot x_{i+1,k-1}^2 \\
&\quad - \left(q_{u,i,j} \cdot x_{i+1,k+1} + q_{m,i,j} \cdot x_{i+1,k} + q_{d,i,j} \cdot x_{i+1,k-1}\right)^2, \\
1 &= q_{u,i,j} + q_{m,i,j} + q_{d,i,j}.
\end{aligned}$$


It can be shown that the choice of the step size in the $x$-grid (18.16) leads indeed 
to probabilities, i.e. that the quantities $q_{d,i,j}$, $q_{m,i,j}$, and $q_{u,i,j}$ are positive (Brigo and 
Mercurio 2006). 
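
The three equations can be solved in closed form. With $\eta = m_{i,j} - x_{i+1,k}$ denoting the distance between the conditional mean and the middle node, one obtains $q_u = (v^2 + \eta^2)/(2\Delta x^2) + \eta/(2\Delta x)$, $q_d = (v^2 + \eta^2)/(2\Delta x^2) - \eta/(2\Delta x)$ and $q_m = 1 - q_u - q_d$. The sketch below implements this for one node under the assumption of a constant volatility and a uniform time step, so that the maximum in (18.16) is trivial. The function name and the inputs are hypothetical.

```python
import math


def branching(x_ij, kappa, sigma, dt):
    """Middle successor index k, step size and probabilities for one node."""
    m = x_ij * math.exp(-kappa * dt)                        # mean, cf. (18.15)
    v2 = sigma ** 2 / (2.0 * kappa) * (1.0 - math.exp(-2.0 * kappa * dt))
    dx = math.sqrt(3.0 * v2)                                # step size, cf. (18.16)
    k = math.floor(m / dx + 0.5)                            # middle node, cf. (18.17)
    eta = m - k * dx                   # distance between mean and middle node
    q_u = (v2 + eta ** 2) / (2.0 * dx ** 2) + eta / (2.0 * dx)
    q_d = (v2 + eta ** 2) / (2.0 * dx ** 2) - eta / (2.0 * dx)
    q_m = 1.0 - q_u - q_d
    return k, dx, (q_u, q_m, q_d)


k, dx, (q_u, q_m, q_d) = branching(x_ij=0.02, kappa=0.1, sigma=0.01, dt=0.25)
mean = q_u * (k + 1) * dx + q_m * k * dx + q_d * (k - 1) * dx
second = q_u * ((k + 1) * dx) ** 2 + q_m * (k * dx) ** 2 + q_d * ((k - 1) * dx) ** 2
print(f"k = {k}, q_u = {q_u:.4f}, q_m = {q_m:.4f}, q_d = {q_d:.4f}")
print(f"tree mean = {mean:.6f}, tree variance = {second - mean ** 2:.3e}")
```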

The values of the short rate tree $r_{i,j}$ can be computed by adding $\theta(t_i)$, which 
is computed by (18.9), to the $x$-tree. Since $\theta$ in (18.9) is derived from a continuous-time 
process there will be a small discretization bias when zero bonds are priced with the 
tree, i.e. the discount curve will not be matched exactly by the tree. To fit discount 
factors exactly one could alternatively compute the correction term $\theta(t_i)$ by an 
additional numerical calibration in the tree instead of using (18.9). The details of this 
calculation can be found in Brigo and Mercurio (2006). 

After the construction of the tree is finished it can be used for product pricing. 
Product prices are computed by iterative expectations. The discretized product 
value $V_{i,j}$ is initialized in time $t_l$. Depending on the specific product this can be 
done by initializing $V_{l,j}$ by the product’s payoff or by the value of a coupon 
payment. The preceding values of $V$ are then computed iteratively as discounted 
expectations 


$$V_{i,j} = \exp(-r_{i,j} \cdot (t_{i+1} - t_i)) \cdot \left(q_{u,i,j} \cdot V_{i+1,k+1} + q_{m,i,j} \cdot V_{i+1,k} + q_{d,i,j} \cdot V_{i+1,k-1}\right)$$


where $k$ was defined by (18.17). At every time point $t_i$ where either a coupon is paid 
or a counterparty of the product has an exercise right, the value of $V_{i,j}$ has to be 
modified appropriately. We will see this in detail in the next section when a pricing 
algorithm for a loan with prepayment rights is developed. 
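
As an illustration of the backward induction, the sketch below prices a zero bond on a trinomial tree. It is a simplified version under our own assumptions: a uniform time grid, a constant volatility, a flat discount curve, and $\theta$ taken from the continuous-time formula with the constant-volatility correction $\sigma^2/(2\kappa^2) \cdot (1 - e^{-\kappa t})^2$. The resulting price therefore carries the small discretization bias mentioned above.

```python
import math


def tree_zero_bond(maturity, n_steps, kappa, sigma, theta):
    """Price a zero bond by backward induction on a trinomial tree for x."""
    dt = maturity / n_steps
    decay = math.exp(-kappa * dt)
    v2 = sigma ** 2 / (2.0 * kappa) * (1.0 - math.exp(-2.0 * kappa * dt))
    dx = math.sqrt(3.0 * v2)
    # Value at the final time grid point: payoff 1 at every node j = -n, ..., n
    values = {j: 1.0 for j in range(-n_steps, n_steps + 1)}
    for i in range(n_steps - 1, -1, -1):
        new_values = {}
        for j in range(-i, i + 1):
            m = j * dx * decay                   # conditional mean of x
            k = math.floor(m / dx + 0.5)         # middle successor node
            eta = m - k * dx
            q_u = (v2 + eta ** 2) / (2.0 * dx ** 2) + eta / (2.0 * dx)
            q_d = (v2 + eta ** 2) / (2.0 * dx ** 2) - eta / (2.0 * dx)
            q_m = 1.0 - q_u - q_d
            r_ij = j * dx + theta(i * dt)        # short rate at this node
            expectation = (q_u * values[k + 1] + q_m * values[k]
                           + q_d * values[k - 1])
            new_values[j] = math.exp(-r_ij * dt) * expectation
        values = new_values
    return values[0]


KAPPA, SIGMA, R0 = 0.1, 0.01, 0.03               # hypothetical parameters


def theta(t):
    """theta(t) for a flat curve at R0 and constant volatility."""
    return R0 + SIGMA ** 2 / (2.0 * KAPPA ** 2) * (1.0 - math.exp(-KAPPA * t)) ** 2


price = tree_zero_bond(maturity=5.0, n_steps=100, kappa=KAPPA, sigma=SIGMA, theta=theta)
print(f"tree price: {price:.6f}, discount factor: {math.exp(-R0 * 5.0):.6f}")
```

Other products are priced with the same loop by changing the initial values and by modifying the node values at coupon and exercise dates.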

Finally, we remark that the trinomial tree is a popular and intuitive but not the 
most efficient way of pricing interest rate products. It can be shown that pricing 
a product in the G1 model is equivalent to solving a partial differential equation 
that is determined from the short rate dynamics (18.5). For partial differential 
equations solution algorithms exist that deliver a higher accuracy for less 
computational effort than the trinomial tree. Details can be found in Randall 
and Tavella (2000). 
18.2.3 A General Loan Pricing Framework 


In this section we combine the rating tree of Sect. 18.2.1 and the interest rate tree of 
Sect. 18.2.2 to a pricing algorithm for loans with embedded options. By assuming 
that interest rate changes are independent from rating changes both models can be 
easily combined to a three-dimensional tree. This model was already suggested by 
Schönbucher (2003) in a different context. 

The resulting three-dimensional tree is illustrated in Fig. 18.5. In this example 
the tree is built for a rating system with six rating grades where the sixth grade is the 
default grade. From every node it is possible to reach eighteen succeeding nodes, 
six possible rating changes times three possible changes in the short rate. Because 
of the independence assumption of rating changes and interest rate changes the tree 
probabilities can be easily computed by multiplying the probabilities of the interest 
rate tree with the probabilities of the rating tree. The pricing of a loan in the tree is 
carried out analogously to the pricing in the two-dimensional trees by computing 
discounted expectations iteratively starting from the rightmost nodes. 

We explain the pricing of loans with prepayment rights in detail. To model 
prepayment some assumptions on the behaviour of debtors and the conditions of 
refinancing a loan have to be made: 


• We assume that a debtor needs the money that was lent by a bank until the loan’s 
maturity. If he is able to get a cheaper loan over the remaining maturity on a 
prepayment date he will prepay with a probability $p_{ex}$. 

• If a debtor prepays and enters a new loan the opportunity costs of capital and the 
internal costs for the new loan are the same as for the old loan (cf. Chap. 17 for 
an explanation of these cost components). 

• If a debtor prepays and enters a new loan, this loan will not have any prepayment 
rights. 

• All banks have the same opinions on default probabilities and recovery rates. 


Fig. 18.5 Illustration of the three-dimensional tree that is used for pricing loans with embedded 
options 


The exercise probability $p_{ex}$ is introduced to model the irrational behaviour of 
retail customers. They do not act perfectly rationally like traders of interest rate 
derivatives or bonds and might not prepay even if it is advantageous for them. The 
probability $p_{ex}$ gives the probability that a debtor will prepay when the conditions 
are in his favour. 

We explain in detail the steps that are necessary to implement a pricing algorithm for a 
fixed-rate bullet loan with prepayment rights in this model. We assume that 
a time grid $0 = t_0, t_1, \ldots, t_l$ is constructed that contains all important time points, the 
payment times of coupons and the times where prepayment is possible. Further, we 
assume that the tree of Fig. 18.5 is constructed using the steps that were explained in 
Sects. 18.2.1 and 18.2.2. We use the notation $N$ for the loan’s notional, $z$ is the loan’s 
fixed interest rate, $T_1, \ldots, T_m$ are the interest rate payment times, $c_{tot}$ is the sum of 
the opportunity cost of capital and the internal cost margin, and $\tau_i$ is the year 
fraction of the $i$-th interest rate period. We compute the price $V(u, r_{i,j}, t_i)$ of a fixed-rate 
bullet loan with prepayment rights depending on the rating $u$, the short rate $r_{i,j}$ 
and time $t_i$ using the algorithm: 


1. At $t_l$: Initialize $V(u, r_{l,j}, t_l)$ and $V_{ex}(u, r_{l,j}, t_l)$ with $N$. 

2. At $t_l$: Add $z \cdot \tau_m \cdot N$ to $V(u, r_{l,j}, t_l)$ and $(z - c_{tot}) \cdot \tau_m \cdot N$ to $V_{ex}(u, r_{l,j}, t_l)$. 

3. At $t_{l-1}$: Compute $V(u, r_{l-1,j}, t_{l-1})$ from the values of $V$ at the succeeding nodes: 

$$\begin{aligned}
V(u, r_{l-1,j}, t_{l-1}) &= \sum_{g=1}^{n-1} p_{ug}(t_l \mid t_{l-1}) \cdot \tilde{V}(g, r_{l-1,j}, t_{l-1}) + p_{un}(t_l \mid t_{l-1}) \cdot R(t_{l-1}) \cdot N, \\
\tilde{V}(g, r_{l-1,j}, t_{l-1}) &= e^{-r_{l-1,j} \cdot (t_l - t_{l-1})} \cdot \left(q_{u,l-1,j} \cdot V(g, r_{l,k+1}, t_l) + q_{m,l-1,j} \cdot V(g, r_{l,k}, t_l) + q_{d,l-1,j} \cdot V(g, r_{l,k-1}, t_l)\right)
\end{aligned}$$

4. Repeat step 3 for $V_{ex}$. 

5. Repeat steps 3 and 4 until time $t_\zeta = T_{m-1}$ is reached. 

6. At $T_{m-1}$: Add $z \cdot \tau_{m-1} \cdot N$ to $V(u, r_{\zeta,j}, t_\zeta)$ and $(z - c_{tot}) \cdot \tau_{m-1} \cdot N$ to $V_{ex}(u, r_{\zeta,j}, t_\zeta)$. 

7. At $T_{m-1}$: If $T_{m-1}$ is a prepayment time replace $V(u, r_{\zeta,j}, t_\zeta)$ by 

$$p_{ex} \cdot N + (1 - p_{ex}) \cdot V(u, r_{\zeta,j}, t_\zeta)$$

if the condition $V_{ex}(u, r_{\zeta,j}, t_\zeta) > N$ is fulfilled. 

8. Repeat steps 3-7 until $t = 0$ is reached. 


To include prepayment rights into loan structures with amortization schedules like 
instalment loans or annuity loans the amortization payments have to be added to the 
interest rate payments in the above algorithm. It is also possible (but a bit more 
complicated) to extend the pricing algorithm to loans with a floating interest rate. 
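
Two building blocks of the algorithm, the expectation over rating changes in step 3 and the prepayment adjustment in step 7, can be sketched as small functions. The function names, the list-based data layout and the numbers are our own assumptions; the discounted interest rate expectations for the non-default grades are assumed to be given.

```python
def rating_expectation(trans_probs_u, continuation_values, recovery_rate, notional):
    """Step 3, first formula: expectation over rating changes from grade u.

    trans_probs_u[g] is the transition probability from u to grade g, the last
    entry belongs to the default grade. continuation_values[g] holds the
    discounted interest rate expectation for the non-default grade g.
    """
    survival_part = sum(p * v for p, v in zip(trans_probs_u[:-1], continuation_values))
    default_part = trans_probs_u[-1] * recovery_rate * notional
    return survival_part + default_part


def apply_prepayment(value, value_ex, notional, p_ex):
    """Step 7: blend the loan value with the notional if prepayment is advantageous."""
    if value_ex > notional:
        return p_ex * notional + (1.0 - p_ex) * value
    return value


# Hypothetical example with two non-default grades and one default grade
node_value = rating_expectation([0.90, 0.08, 0.02], [100.0, 95.0],
                                recovery_rate=0.4, notional=100.0)
prepaid = apply_prepayment(value=105.0, value_ex=103.0, notional=100.0, p_ex=0.5)
kept = apply_prepayment(value=105.0, value_ex=99.0, notional=100.0, p_ex=0.5)
print(node_value, prepaid, kept)
```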
The auxiliary variable $V_{ex}$ is used only to determine if prepayment is advantageous. 
It is computed from the interest margin excluding all cost components 
except for the risk costs. If the loan value at the prepayment date is “fair”, i.e. if 
the loan’s margin is exactly equal to the then prevailing minimum margin under the 
assumptions on costs used in the pricing algorithm, then the condition $V_{ex} = N$ 
would be fulfilled. Therefore, prepayment is advantageous if the loan under its 
current terms is too expensive under the actual market rates which is reflected in the 
condition $V_{ex} > N$. 

We will illustrate this pricing algorithm with a numerical example in the next 
section. This section will be concluded with comments on the theoretical properties 
of this pricing model. What we have done in the tree algorithm is mixing risk- 
neutral probabilities of the interest rate tree that are implied from the market with 
real-world probabilities of the rating tree that are based on statistical information. 
The underlying theory of derivatives pricing models implies a trading strategy that 
allows the perfect hedging of interest rate risk with basic instruments in the interest 
rate market. The combination with statistical probabilities results in a model of an 
incomplete market, i.e. a model that contains risks that are not tradable and cannot 
be hedged. 

This has consequences for the risk management of prepayment rights. In principle, 
prepayment rights can be hedged by receiver swaptions. The number of receiver 
swaptions that are needed for the hedge (the hedge ratio) is determined by the pricing 
model. However, if realized default rates are different from default probabilities, 
these hedge ratios turn out to be wrong. In this case the hedge might lead to 
unexpected losses. Analyzing this model in some detail shows that the risk of 
unexpected losses on the hedge of prepayment rights does not lead to an increase 
of economic capital for a loan portfolio because for typical loan portfolios the level of 
economic capital is dominated by default risks. That means that unexpected losses in 
hedges of prepayment rights are already covered by the economic capital that is needed 
as a buffer for unexpected losses due to defaults. The details of these analyses are 
worked out in Engelmann (2010). 


18.3 Numerical Example 


In this section, we will present a numerical example using real market data. We use 
the discount curve that is presented in Table 18.3 and the swaption volatility matrix 
of Table 18.4. 

As an example we consider a bullet loan with a maturity of 15 years. The loan 
has a fixed interest rate and a prepayment right after 10 years. The debtor has the 
right to fully pay back the loan after 10 years without penalty. The notional of 
the loan is 1 million. The loan is secured with collateral worth 400,000. As in the 
examples of Chap. 17 the default probabilities are computed from the transition 
matrix of Fig. 6.7 in Chap. 6. 

Table 18.3 Discount curve used in the example for pricing a loan with prepayment rights 


Maturity (years) Discount factor Maturity (years) Discount factor 
0.0027 0.999978 2.0000 0.955710 
0.0833 0.999214 3.0000 0.932863 
0.1667 0.997850 4.0000 0.900632 
0.2500 0.996469 5.0000 0.866350 
0.3333 0.995137 6.0000 0.830990 
0.4167 0.993367 7.0000 0.796393 
0.5000 0.991578 8.0000 0.762382 
0.5833 0.989688 9.0000 0.727801 
0.6667 0.987618 10.0000 0.694570 
0.7500 0.985465 12.0000 0.631269 
0.8333 0.983284 15.0000 0.542595 
0.9167 0.981037 20.0000 0.434336 
1.0000 0.978903 30.0000 0.318877 
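
To relate the discount curve to the prepayment option, the following sketch computes the forward swap rate of a swap starting in 10 years and maturing in 15 years from some of the discount factors of Table 18.3. It is an illustration only: the log-linear interpolation between curve nodes and the annual fixed payments with a year fraction of one are our assumptions, not conventions stated in the text.

```python
import math

# Selected nodes of Table 18.3: (maturity in years, discount factor).
CURVE = [(5.0, 0.866350), (6.0, 0.830990), (7.0, 0.796393),
         (8.0, 0.762382), (9.0, 0.727801), (10.0, 0.694570),
         (12.0, 0.631269), (15.0, 0.542595), (20.0, 0.434336)]


def discount_factor(t):
    """Log-linear interpolation between curve nodes (assumed scheme)."""
    if not CURVE[0][0] <= t <= CURVE[-1][0]:
        raise ValueError("maturity outside the tabulated range")
    for (t0, d0), (t1, d1) in zip(CURVE, CURVE[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return math.exp((1.0 - w) * math.log(d0) + w * math.log(d1))


def forward_swap_rate(start, end):
    """Par rate of a swap from `start` to `end` (whole years),
    assuming annual fixed payments with a year fraction of one."""
    annuity = sum(discount_factor(float(t)) for t in range(start + 1, end + 1))
    return (discount_factor(float(start)) - discount_factor(float(end))) / annuity


rate_10y_5y = forward_swap_rate(10, 15)
print(f"10Y into 5Y forward swap rate: {rate_10y_5y:.4%}")
```

The forward rate obtained in this way is a rough reference level for the interest rate at which prepayment after 10 years becomes attractive for a debtor who is not subject to credit risk.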


Table 18.4 Swaption volatilities used in the example for pricing a loan with prepayment rights 
(First column: swaption expiry, First row: tenor of the underlying swap) 


Expiry  1      2      3      4      5      6      7      8      9      10 
0.08    0.475  0.376  0.338  0.319  0.311  0.305  0.297  0.291  0.286  0.281 
0.17    0.486  0.387  0.350  0.332  0.316  0.305  0.296  0.291  0.286  0.281 
0.25    0.484  0.402  0.360  0.332  0.314  0.301  0.293  0.287  0.283  0.279 
0.50    0.453  0.367  0.326  0.301  0.284  0.275  0.271  0.268  0.268  0.265 
0.75    0.422  0.340  0.302  0.279  0.265  0.257  0.253  0.251  0.250  0.250 
1       0.392  0.316  0.283  0.263  0.249  0.241  0.237  0.236  0.235  0.234 
1.5     0.312  0.269  0.248  0.233  0.224  0.218  0.216  0.215  0.215  0.215 
2       0.261  0.238  0.224  0.214  0.206  0.203  0.201  0.201  0.201  0.202 
3       0.210  0.198  0.190  0.186  0.182  0.182  0.182  0.182  0.182  0.182 
4       0.180  0.173  0.170  0.168  0.167  0.167  0.167  0.167  0.167  0.167 
5       0.162  0.158  0.156  0.156  0.157  0.156  0.155  0.155  0.155  0.156 
7       0.144  0.142  0.141  0.140  0.140  0.139  0.139  0.140  0.141  0.142 
10      0.131  0.130  0.129  0.129  0.129  0.130  0.131  0.132  0.134  0.136 
15      0.130  0.131  0.134  0.137  0.141  0.144  0.148  0.152  0.156  0.159 
20      0.165  0.169  0.174  0.179  0.183  0.186  0.189  0.191  0.193  0.194 
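
The following sketch shows how a quoted volatility could be converted into a swaption price. It assumes that the entries of Table 18.4 are Black (lognormal) volatilities, which the text does not state explicitly. The volatility 0.129 is the table entry for expiry 10 and tenor 5. The forward swap rate and the annuity are purely illustrative inputs and are not taken from the example.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_swaption(forward, strike, vol, expiry, annuity, receiver=True):
    """Black (1976) swaption price per unit notional.

    forward : forward swap rate, strike : fixed rate of the swap,
    vol     : lognormal volatility, expiry : option expiry in years,
    annuity : present value of one unit paid on each fixed payment date.
    """
    std = vol * math.sqrt(expiry)
    d1 = (math.log(forward / strike) + 0.5 * std * std) / std
    d2 = d1 - std
    if receiver:
        return annuity * (strike * norm_cdf(-d2) - forward * norm_cdf(-d1))
    return annuity * (forward * norm_cdf(d1) - strike * norm_cdf(d2))


VOL_10Y_5Y = 0.129   # Table 18.4, expiry 10, tenor 5
FORWARD = 0.05       # illustrative forward swap rate (assumption)
ANNUITY = 3.0        # illustrative annuity (assumption)

receiver_atm = black_swaption(FORWARD, FORWARD, VOL_10Y_5Y, 10.0, ANNUITY)
print(f"ATM receiver swaption price per unit notional: {receiver_atm:.6f}")
```

A receiver swaption is the natural market counterpart of a prepayment right because both gain in value when interest rates fall.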


To measure the effect of rating migration on the pricing, we carry out the 
algorithm of Sect. 18.2.3 both with the full transition matrix and with the term- 
structures of default probabilities that were computed in Table 17.1 of Chap. 17. In 
the latter case the algorithm of Sect. 18.2.3 is applied with two rating grades only, 
the non-default grade and the default grade. The exercise probability pex in this 
algorithm is set to 100%. 
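
The difference between the two inputs can be sketched as follows. A term structure of default probabilities keeps only the last column of the powers of the transition matrix, whereas the full matrix also tracks in which non-default grade a surviving debtor ends up. The three-state matrix below (good, poor, default) is hypothetical and is not the matrix of Fig. 6.7. Time-homogeneous annual transitions are assumed.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cumulative_pds(transition, years):
    """Cumulative default probabilities for each non-default starting grade.

    The last state of `transition` is the absorbing default state.
    Entry [t][g] is the probability that a debtor starting in grade g
    has defaulted within t + 1 years.
    """
    n = len(transition)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = []
    for _ in range(years):
        power = mat_mul(power, transition)
        result.append([row[-1] for row in power[:-1]])
    return result


# Hypothetical one-year transition matrix: good, poor, default.
TRANSITION = [[0.95, 0.04, 0.01],
              [0.10, 0.80, 0.10],
              [0.00, 0.00, 1.00]]

pds = cumulative_pds(TRANSITION, 10)
for year in (1, 2, 5, 10):
    good, poor = pds[year - 1]
    print(f"year {year:2d}: PD good = {good:.4f}, PD poor = {poor:.4f}")
```

Pricing with the term structure alone uses only these cumulative probabilities, so a surviving debtor is treated identically regardless of upgrades or downgrades that occurred in the meantime.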

There are two ways to extend the RAROC pricing framework of Chap. 17 to 
loans with amortization rights. One possibility is increasing the risk costs by 
including prepayment risk. This is done by computing the risk costs from the 
condition V = N. This condition was also applied in the case without amortization 
rights but leads to an increased value of the risk costs when amortization rights are 
included. Alternatively, instead of increasing the interest margin a bank could 
charge the option premium by an upfront payment. In this case the risk costs are 

Table 18.5 Risk costs (in %) for the 15 years loan with and without prepayment right 


Rating grade   Risk costs (no prepay. right)   Risk costs (PD term structure)   Risk costs (migration matrix) 
1              0.038                           0.193                            0.194 
2              0.098                           0.247                            0.251 
3              0.192                           0.331                            0.339 
4              0.493                           0.616                            0.637 
5              1.286                           1.397                            1.440 
6              2.759                           2.890                            2.964 
7              6.084                           6.432                            6.543 
8              10.244                          10.986                           11.057 

computed in the same way as for the otherwise identical loan without amortization 
rights. The option premium is determined by pricing the loan using the margin of 
formula (17.9) of Chap. 17 and computing the difference to the initial notional. 
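
The condition V = N that determines the risk costs can be illustrated with a strongly simplified sketch. It prices a bullet loan without prepayment right and solves for the margin by bisection. The notional, collateral and maturity are those of the example. The flat funding rate, the flat annual default probability, annual payments and the recovery of the collateral at the end of the default year are our simplifying assumptions. This is not formula (17.9) and not the tree algorithm of Sect. 18.2.3.

```python
def loan_value(margin, notional, collateral, funding_rate, annual_pd, years):
    """Expected discounted cash flows of a bullet loan with default risk."""
    value, survival = 0.0, 1.0
    for year in range(1, years + 1):
        discount = (1.0 + funding_rate) ** (-year)
        default_prob = survival * annual_pd   # default during this year
        survival *= 1.0 - annual_pd           # survival until year end
        payment = notional * (funding_rate + margin)
        if year == years:
            payment += notional               # bullet repayment at maturity
        value += discount * (survival * payment + default_prob * collateral)
    return value


def solve_margin(notional, collateral, funding_rate, annual_pd, years, tol=1e-12):
    """Margin with loan_value(margin) = notional, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if loan_value(mid, notional, collateral, funding_rate, annual_pd, years) < notional:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


margin = solve_margin(1_000_000.0, 400_000.0, 0.04, 0.01, 15)
print(f"risk margin solving V = N: {margin:.4%}")
```

With a prepayment right the same root search applies, but the valuation function has to be replaced by the tree algorithm, which lowers the loan value for a given margin and therefore raises the margin that solves V = N.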

We start with calibrating the G1 model. Since the loan has one prepayment right 
only, a reasonable calibration strategy is to calibrate the model to the 10Y swaption 
into a 5Y swap which can be viewed as the underlying option of the loan. Since two 
parameters cannot be calibrated from one instrument, we have calibrated κ from the 
full swaption matrix, i.e. we have solved the minimization problem (18.10) with 
time-independent σ to determine κ.⁴ After that, we modify σ to match the price of 
the calibration instrument. We find κ = 0.0182 and σ = 0.0090. 
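
To give a feeling for the magnitude of the calibrated parameters, the sketch below computes the standard deviation of the interest rate factor at the prepayment date. It assumes that the factor of the G1 model follows an Ornstein-Uhlenbeck process with mean reversion 0.0182 and volatility 0.0090, which is the usual form of a one-factor Gaussian model. The exact parametrization used in this chapter may differ.

```python
import math

KAPPA = 0.0182   # calibrated mean reversion
SIGMA = 0.0090   # calibrated volatility


def factor_std(t, kappa=KAPPA, sigma=SIGMA):
    """Standard deviation at time t of an Ornstein-Uhlenbeck factor
    dx = -kappa * x dt + sigma dW that starts at a known value."""
    variance = sigma * sigma * (1.0 - math.exp(-2.0 * kappa * t)) / (2.0 * kappa)
    return math.sqrt(variance)


std_10y = factor_std(10.0)
no_reversion = SIGMA * math.sqrt(10.0)
print(f"std of the factor after 10 years: {std_10y:.4%}")
print(f"same without mean reversion:      {no_reversion:.4%}")
```

Because the mean reversion is small, the standard deviation after 10 years is only slightly below the value without mean reversion.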

In the first example we assume that the premium for the prepayment option results 
in an increased margin. We compute risk costs for the loan without prepayment right 
to get the reference rate reflecting the margin for expected loss only. After that we 
compute the risk costs including the prepayment right for the two cases explained 
above, using a term structure of default probabilities only versus using the full 
transition matrix. The results are presented in Table 18.5. In the second example we 
assume that the loan is sold with the risk margin corresponding to an otherwise 
identical loan without prepayment right and that the option premium is paid upfront 
by the debtor. The resulting option premia are reported in Table 18.6. 

From Table 18.5 we see that for the good rating grades the largest proportion of 
the risk costs corresponds to the prepayment right. For the poor rating grades it is 
vice versa. Here the risk costs are mainly driven by default risk. Further, we see that 
migration does not have an effect for the good rating grades. These debtors only face 
the risk of downgrades which would make their prepayment option less valuable. 
For this reason it does not make a difference if the prepayment right is priced with a 
term structure of default probabilities only or with the full transition matrix. The 
situation is different for debtors with poor rating grades. They have the chance of 
upgrades which would increase the value of their prepayment option considerably. 
This chance of upgrades leads to a higher risk margin if the pricing is done with the 
migration matrix compared to the term structure of default probabilities. 
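
These observations can be checked directly on the numbers of Table 18.5. The sketch below computes, for each rating grade, the share of the risk costs with prepayment right (migration matrix) that is attributable to the prepayment right, and the additional risk costs caused by rating migration. The decomposition into these two quantities is our own way of summarizing the table.

```python
# Table 18.5: grade -> (no prepayment right, PD term structure, migration matrix), in %.
RISK_COSTS = {
    1: (0.038, 0.193, 0.194),
    2: (0.098, 0.247, 0.251),
    3: (0.192, 0.331, 0.339),
    4: (0.493, 0.616, 0.637),
    5: (1.286, 1.397, 1.440),
    6: (2.759, 2.890, 2.964),
    7: (6.084, 6.432, 6.543),
    8: (10.244, 10.986, 11.057),
}

# Share of the total risk costs that is due to the prepayment right.
prepay_share = {g: (mig - base) / mig for g, (base, _, mig) in RISK_COSTS.items()}
# Additional risk costs (in percentage points) due to rating migration.
migration_addon = {g: mig - pd_ts for g, (_, pd_ts, mig) in RISK_COSTS.items()}

for grade in RISK_COSTS:
    print(f"grade {grade}: prepayment share = {prepay_share[grade]:.1%}, "
          f"migration add-on = {migration_addon[grade]:.3f}")
```

For rating grade 1 most of the risk costs stem from the prepayment right, while for rating grade 8 the share is below one tenth.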


⁴ The calibration of κ from market data might be rather unstable, i.e. the value of κ fluctuates 
strongly with changing market data. For this reason, this parameter is alternatively often estimated 
empirically from historical data. 

Table 18.6 Prepayment option premia (premium charged upfront) 


Rating grade   Option premium (PD term structure)   Option premium (migration matrix) 
1              16,917                               17,093 
2              16,349                               16,776 
3              15,278                               16,169 
4              13,309                               15,335 
5              11,112                               14,938 
6              10,973                               16,529 
7              18,998                               24,662 
8              28,592                               30,852 


The picture is similar in Table 18.6 where the option premium is charged upfront 
instead of by an increased margin. We see that the effect of rating migration is small 
for good rating grades and considerable for the poor rating grades.⁵ We see that 
option premia are not monotonic in the rating grade. Furthermore, we find that the 
premium increase under the inclusion of rating migration is also not monotonic in 
the rating grade. The option premium is the result of several economic effects. First, 
of course, there is a chance for falling interest rates. This effect is the same for all 
debtors. Second, for debtors with very low default probabilities the option premium 
is basically the premium for interest rate risk. Default risks do not play a role in this 
situation. Third, if default probabilities are increased mildly this leads to a greater 
chance that a debtor will default before the prepayment date and the prepayment 
right will expire worthless. This leads to a decrease in the premium. Fourth, for 
debtors with high default probabilities the risk costs will decrease considerably if 
they survive until the prepayment date. This has the effect that a debtor will prepay 
for sure almost regardless of the interest rates at the prepayment date in this case. 
All these effects are included in the option premium. 
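
The non-monotonic pattern can be read off Table 18.6 with a few lines of code. The sketch below locates the rating grade with the lowest premium for both pricing variants and computes the premium increase due to rating migration. The premium for grade 1 under the PD term structure is read as 16,917.

```python
# Table 18.6: grade -> (PD term structure, migration matrix), upfront premia.
PREMIA = {
    1: (16917, 17093),
    2: (16349, 16776),
    3: (15278, 16169),
    4: (13309, 15335),
    5: (11112, 14938),
    6: (10973, 16529),
    7: (18998, 24662),
    8: (28592, 30852),
}


def is_monotonic(values):
    """True if the sequence is entirely non-decreasing or non-increasing."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


pd_premia = [PREMIA[g][0] for g in sorted(PREMIA)]
mig_premia = [PREMIA[g][1] for g in sorted(PREMIA)]
increase = [m - p for p, m in zip(pd_premia, mig_premia)]

min_grade_pd = min(PREMIA, key=lambda g: PREMIA[g][0])
min_grade_mig = min(PREMIA, key=lambda g: PREMIA[g][1])

print("lowest premium (PD term structure) at grade", min_grade_pd)
print("lowest premium (migration matrix) at grade", min_grade_mig)
print("premium increase due to migration:", increase)
print("monotonic:", is_monotonic(pd_premia), is_monotonic(mig_premia),
      is_monotonic(increase))
```

The premia first fall with worsening rating grade, which reflects the third effect above, and rise again for the poorest grades, which reflects the fourth effect.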


18.4 Conclusion 


In this chapter, we have discussed an algorithm for pricing loans with embedded 
options. We have focussed on prepayment rights because these are the most popular 
embedded options in loan markets. However, it is also possible to extend the pricing 
algorithm to floating rate loans with embedded caps and floors. In a numerical 
example we have computed the necessary margin increase or the upfront premium 
depending on the way the prepayment right is charged by a bank. We have seen that 
option premia can be considerable and that these options should not be neglected 
when a loan is sold. 


⁵ In fact the option premia for rating grade 1 should be identical because there is no possibility for 
an upgrade of the debtor. The difference results from a numerical effect because the interpolation 
of default probabilities in the term structure leads to slightly different numbers than the exact 
calculation by transition matrices corresponding to year fractions. 

The presented algorithm offers further possibilities for extension. One key 
assumption in the algorithm was that the cost structure of a bank remains constant 
in time. The recent financial crisis has shown that this is not true. In times of 
financial distress the funding conditions of a bank can worsen considerably which 
leads to an increase of the margin of a loan. If the loan contains a prepayment right, 
however, the debtor might be able to refinance the loan at a lower rate simply because 
banks' funding costs have fallen again once markets have returned to normal. 
By modifying the cost assumptions in the algorithm, this effect can be included in 
the option premium. 

Finally, the algorithm can be used for the risk management of loan portfolios. It 
can be used for the calculation of general loss provision that was already outlined in 
Chap. 17. Furthermore, it can be used to hedge the embedded options by market 
options like interest rate caps and interest rate swaptions. The pricing model will tell 
the amount of hedging instruments needed by calculating the so-called greeks (delta, 
gamma, vega) based on information about the current market prices of interest rate 
options (included in the model parameters of the G1 model), the default probabilities 
of the debtors, the migration probabilities, and the product structures. In addition, it 
offers the possibility to model irrational behaviour of debtors. Therefore, it includes 
all information that is needed from an economic perspective and still results in a 
tractable model that can be implemented efficiently. 
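
A common way to obtain such sensitivities from a pricing model is to bump one input and reprice. The sketch below computes delta and gamma with respect to the interest rate level by central finite differences. The pricing function is a stand-in (a zero-coupon bond under a flat continuously compounded rate) chosen only because its sensitivities are known in closed form. In practice it would be replaced by the tree algorithm, and the bump size is our assumption.

```python
import math


def toy_price(rate, maturity=10.0, notional=100.0):
    """Stand-in pricer: zero-coupon bond under a flat rate."""
    return notional * math.exp(-rate * maturity)


def delta_gamma(price_fn, rate, bump=1e-4):
    """Central finite-difference estimates of the first and second
    derivative of the price with respect to the rate."""
    up = price_fn(rate + bump)
    mid = price_fn(rate)
    down = price_fn(rate - bump)
    delta = (up - down) / (2.0 * bump)
    gamma = (up - 2.0 * mid + down) / (bump * bump)
    return delta, gamma


delta, gamma = delta_gamma(toy_price, 0.04)
print(f"delta = {delta:.4f}, gamma = {gamma:.4f}")
```

A vega can be estimated in the same way by bumping the volatility input of the model and recalibrating before repricing.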